ABSTRACT


This thesis examines the network management functions
required for a local computer network. Initially, general
management considerations are addressed. These include
problem determination, performance analysis, problem
management, change management, configuration management, and
operations management. The sidestream, mainstream,
centralized, decentralized, and hybrid network monitoring
technologies are then discussed. An investigation of
network measurement tools and their use in generating
management reports is undertaken. The topics of analysis
timing, performance measure utilization, and parameter
selection are considered. Procedures for detecting,
diagnosing, and correcting network component failures are
presented. Solutions are proposed for problems associated
with managing a local computer network-long haul network
interface. Finally, a discussion of the mission,
objectives, and responsibilities of a local computer network
central monitoring site is undertaken.
I. INTRODUCTION


One of the major objectives of any local network is to
provide reliable communications facilities, reflected both
in the continued availability of the network itself and in
the lowest possible error rate as seen by individual
processes [Ref. 14: p. 713]. To this we would add the
requirements of high capacity and minimal end-to-end delay
experienced by the user. We now submit what we feel is a
responsible and complete definition of network management.
Our definition includes: collection of measurements and
subsequent statistics generation, hardware and software
failure detection, diagnosis and correction, network
performance analysis, and network parameter adjustment.

One school of thought advocates management of local area
computer networks, while another feels that management, as
we have defined it, is not required. We support the former
of the two. The benefits to be gained from the management
of a local computer network are numerous. We are able to
reduce the impact of failures and increase network
availability by detecting, diagnosing, and correcting
hardware and software problems very quickly. Control and
monitoring technologies allow network operators to
anticipate problems. Rather than reacting, operators are
able to analyze problems and take appropriate action to
minimize them, or even preclude their occurrence.
Management of a local computer network gives us the ability
to provide for capacity planning, manage the growth of the
network, control costs, and eliminate redundant or unused
capacity. We can also improve the network's performance and
its availability to users by monitoring the network
components and through evaluation of the network as a whole.
It is the author's intent to identify and discuss the
tools required by network management for the attainment of
these and other benefits. We will begin by describing a
SPLICE local area computer network, followed by a discussion
of six network management disciplines. Chapter 2 will
present various network monitoring methodologies and
technologies. In Chapter 3, we enter into a discussion of the
measurement tools available to the operator and suggest ten
management reports to be generated from collected data.
Chapter 4 provides information on analysis timing, network
performance measure utilization and parameter selection, and
on component failure detection, diagnosis, and notification.
Chapter 5 identifies and suggests solutions for the problems
associated with managing the LAN/DDN interface. In Chapter
6, we conclude with a discussion of the mission, objectives,
and responsibilities of a LAN central monitoring site.

A. ASSUMPTIONS

To productively discuss the topic of network management,
it is important that a common base of understanding
concerning the SPLICE (Stock Point Logistics Integrated
Communications Environment) local area computer network be
established. This section briefly describes the Network
Layer Protocol proposed for the SPLICE LAN. This discussion
will include: error detection, packet acknowledgement,
collision detection, access control, bus control,
retransmission technique, and packet format. Additionally, the
network topology and physical transmission medium will be
identified. Finally, a brief description of the proposed End
to End Protocol will be presented. A more detailed
explanation of the SPLICE concept can be found in [Ref. 1].


1. Network Topology and Transmission Medium

The Ring, Star, Unstructured, and Global Bus topologies
were discussed in [Ref. 2]. Primary considerations
made during the selection of a topology were its
flexibility, reliability, and simplicity. Understanding that the
structure of the network must lend itself to change and
reconfiguration, one author [Ref. 2: p. 21] recommended that
a global bus topology be adopted for the SPLICE local
computer network.

Although a number of transmission media were
discussed in [Ref. 2], no particular technology was
recommended for all SPLICE network configurations. For this
discussion of network management, it will be assumed that
the transmission medium is coaxial cable and that a baseband
technology is being utilized.

2. Network Layer Protocol

Decentralized control of the bus is the premise upon
which all subsequent characteristics are based. Nodes
access the network utilizing a random access contention
mechanism with collision detection (CSMA/CD). Error
detection is accomplished through the use of a cyclic redundancy
checksum. The acknowledgement for a packet successfully
received is undertaken by either sending a special
acknowledgement packet or by including the acknowledgement with a
data packet bound for the appropriate node. Upon detection
of a collision, a node implements an adaptive binary
exponential backoff retransmission technique. Finally, there
exists a single packet format for both data and control
information, the specific type being identified in the
packet type field [Ref. 2: p. 53].
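
The retransmission step can be illustrated with a minimal
sketch of plain truncated binary exponential backoff, the
general technique used in CSMA/CD networks. The function
name, the cap of 10 on the exponent, and the notion of a
"slot" are assumptions made for illustration; the adaptive
refinement proposed in [Ref. 2] is not modeled here.

```python
import random


def backoff_slots(collisions, rng, max_exponent=10):
    """Pick a retransmission delay, in slot times, after the n-th collision.

    Hypothetical helper: the cap of 10 is an assumed value, not one
    taken from the SPLICE protocol definition.
    """
    if collisions < 1:
        raise ValueError("backoff applies only after at least one collision")
    exponent = min(collisions, max_exponent)
    # Choose uniformly from 0 .. 2**exponent - 1 slots, so the expected
    # delay doubles with each successive collision until the cap is hit.
    return rng.randrange(2 ** exponent)


rng = random.Random(1)
delays = [backoff_slots(n, rng) for n in range(1, 6)]
```

Doubling the range after each collision spreads contending
nodes over a wider window as the load increases, which is
why the delay experienced by a packet is a useful tuning
statistic.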
3. End to End Protocol


TCP was utilized as a basis from which to develop
the transport protocol. Justification for the use of TCP
can be found in [Ref. 3]. A major consideration during the
design of an end to end protocol was the assumption that
SPLICE LANs would be connected to each other through the
Defense Data Network. The fact that the end to end protocol
currently planned for the DDN is TCP further accentuates the
benefits to be derived by having a TCP based transport
protocol. Investigation shows that if TCP is used in the
strictest sense without any modification as the local
transport control protocol, simple internetwork communication
will be achieved at the expense of suboptimal intranetwork
performance [Ref. 2: p. 73].

B. LAN ARCHITECTURE

This section depicts and briefly describes the logical
and physical views of the SPLICE LAN. These diagrams are
included in order to provide a visual representation which
may be referred to during the discussion of network
management throughout the thesis.

1. Local Network Logical View

The six boxes along the top of Figure 1.1 are identified
as operation functions implemented in software modules.
The three boxes to the right and in the middle represent
support functions implemented in software modules. Control
messages flow along the logical control bus while data
messages flow along the logical data bus. A more detailed
explanation of these functional modules can be found in
[Ref. 4].

Figure 1.1 Local Network Logical View

2. Local Network Physical View

There exists only one physical bus upon which will
flow both control and data messages. The functions
identified in the logical view of the network have been assigned
to specific minicomputers. As can be seen in Figure 1.2,
the network management function has not been identified.
Theoretically, this function could reside in one or all of
the network nodes. An in-depth discussion of this topic will
be undertaken in Chapter 2.

C. NETWORK MANAGEMENT DISCIPLINES

If viewed as a single module, the network management
function appears quite complex. Different aspects of the
function appear to overlap, while others appear to be
disjoint and unrelated. In order to more effectively
analyze the various aspects of the network management
function, a disaggregation of the function into unique,
identifiable modules is undertaken. Freeman proposes six
distinct management disciplines associated with managing the
components of a computer network [Ref. 5: p. 91]. These
disciplines include: problem determination, performance
analysis, problem management, change management,
configuration management, and operations management. The purpose for
presenting these disciplines is twofold: first, to create
more manageable and understandable modules through which the
concept of network management can be discussed, and second,
to provide a foundation upon which various network
management techniques can be analyzed throughout the thesis. Each
of these disciplines will be briefly described in the
following sections.

Figure 1.2 Local Network Physical View.

1. Problem Determination

Problem determination is the process of identifying
a failing or down component of the network so that
corrective action may be taken. It includes: awareness that a
problem exists, isolation of the problem to a particular
element, identification of what caused the problem, and
determination of the correct organization, individual, or
vendor who is responsible for the correction of that specific
type of problem.

2. Performance Analysis

Performance analysis deals with quantifiably
answering the question, 'How well is the network doing
what it is supposed to do?'. It provides for the
measurement of certain dependent variables throughout the network.
These measurements are then compared to criteria that have
been previously established by some other means (e.g. by
mathematical models). By observing the variance between
these figures, a snapshot of the network's performance can
be obtained for that particular instant in time. A number
of variables measured can be classified as "tuning"
statistics. Certain parameters exist which can be adjusted by
network operations personnel in order to affect the values
of these tuning statistics. In this way, we can affect both
network performance and the quality of the service provided
by the network as perceived by the user.
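
As a rough illustration of comparing measurements against
previously established criteria, the following sketch flags
the variables whose relative deviation exceeds a tolerance.
The variable names, the sample values, and the 25 percent
tolerance are all hypothetical and are not taken from the
thesis.

```python
def deviations(measured, criteria):
    """Return measured-minus-expected for every variable present in both."""
    return {name: measured[name] - criteria[name]
            for name in criteria if name in measured}


def out_of_tolerance(measured, criteria, tolerance):
    """List the variables whose relative deviation exceeds the tolerance."""
    flagged = []
    for name, delta in deviations(measured, criteria).items():
        expected = criteria[name]
        # Skip criteria of zero to avoid dividing by zero.
        if expected != 0 and abs(delta) / abs(expected) > tolerance:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical criteria (e.g. from a mathematical model) and one snapshot.
criteria = {"mean_delay_ms": 20.0, "throughput_kbps": 800.0}
measured = {"mean_delay_ms": 31.0, "throughput_kbps": 790.0}
flagged = out_of_tolerance(measured, criteria, tolerance=0.25)
```

A flagged variable would then be a candidate tuning
statistic for which an operator adjusts the associated
network parameter.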

3. Problem Management

Problem management concerns the reporting, tracking,
and resolution of problems that affect a user's or process'
capability to communicate with any other user or process.
Establishment and maintenance of a problem database can be
accomplished in a number of ways. Problems may be
documented manually utilizing pencil and paper. They may be
recorded semi-automatically through manual entry into a
database. Or, problems may be recorded automatically
through the interaction of the problem management module
with the problem determination and performance analysis
modules. The method chosen through which network problems
will be recorded should provide for data consistency, real
time information or nearly so, user accessibility, and
minimal operations personnel involvement. A list of
possible entries for inclusion in a problem record is
provided in Appendix A.

4. Change Management

Changes made to a network component that are not
promulgated throughout, or made available to, the network may
lead to substantial delay when communicating with that
element or even make that element inaccessible. Change
management precludes these events from occurring by
reporting, tracking, obtaining approval for, and verifying
the implementation of changes in network components [Ref. 5:
p. 91]. Pencil and paper, or manual entry into a database,
are two methods by which change management may be
accomplished.

5. Configuration Management

Configuration management provides for the creation
of a database which contains the past, present, and future
physical and logical characteristics of all network elements
[Ref. 5: p. 91]. Included in this would be the SPLICE
minicomputers, host computers, shared resources, the subnetwork,
and pertinent information concerning any connected networks.
The configuration management database should be accessible
by both software modules and users as needed. Updating and
maintenance of this database could be accomplished in the
same manner as the problem management database. It is this
researcher's opinion that configuration management could
most efficiently be accomplished utilizing automated
techniques which are based on the interaction of the various
network management modules. A list of entries that may be
included in a configuration management record of a network
component is included in Appendix B.

6. Operations Management

Operations management supports the remote
manipulation of various network elements [Ref. 5: p. 91]. Some of
the forms this manipulation takes include: testing a piece
of hardware such as an adapter, testing specific software
such as a process which counts the number of times an
individual packet attempts to access the channel before it is
successful, adjusting parameters in order to affect the
values of certain dependent variables which characterize the
performance of the network, and starting up a remote process
within a node which acts as an artificial traffic generator.
Additionally, during the process of network reconfiguration,
this management function supports the remote loading of
software into the appropriate network element.
II. BASIC ISSUES IN NETWORK MONITORING


Measurements allow us to gain valuable insight regarding
network usage and behavior [Ref. 6: p. 1439]. They provide
a means to evaluate the performance of the implemented
protocols. Additionally, they give the designer the ability
to detect network inefficiencies and identify design flaws.
On an operational level, measurement provides the statistics
upon which the network is tuned through adjustment of
appropriate parameters. In a global sense, measurement can be
seen as the foundation upon which network management is
based. Hamming expresses the importance of measurement in
the statement, "It is difficult to have a science without
measurement." This emphasis on an accurate measurement
capability assists in understanding why such elaborate and
complex measurement techniques have been devised for
experimental and operational networks.

Before any type of measurement is conducted of a network
or its associated components, two basic questions must be
answered. They are, 'What is to be measured?', and 'Why
should the measurement be taken?'. These questions will be
addressed in Chapters 3 and 4 respectively. At this time,
an explanation of basic monitoring methodologies will be
undertaken, followed by a discussion of current monitoring
technologies.

A. NETWORK MONITORING METHODOLOGIES

Currently, there exist three basic methodologies
utilized as the foundations for the creation of various
network monitoring technologies. These three methods are
hardware monitoring, software monitoring, and hybrid
monitoring. These methods will be discussed for the purpose
of establishing a basis upon which the monitoring
technologies may be analyzed.

1. Hardware Methodology

A pure hardware monitor is a unit that is both
physically and logically distinct from the network component
being measured [Ref. 7: p. 57].

The interface between the monitor and the component is a
physical probe used for the collection and passing of
electronic signals from the component to the monitoring device.
Figure 2.1 depicts a generalized hardware monitoring device
[Ref. 7: p. 57].


Figure  2.1  Hardware  Monitoring  Device:  Logical  View. 
The critical item needed for a hardware monitor is
an electronic signal that indicates the occurrence of an
event [Ref. 7: p. 57]. Since many signals to be monitored
are of a relatively low voltage, one must consider that the
introduction of a monitoring device may disturb the normal
operation of the circuit being monitored. To preclude this,
a high impedance probe can be utilized. The signal observed
by the probe is amplified and passed to a signal filter and
combination logic unit. The task of the signal filter and
combination logic unit is to mask and combine signals
received from various probes. This output is then sent to a
time and count unit. Here, the duration of a specific
signal, or the number of times a certain signal occurs, can
be recorded. Having collected the required data appropriate
for the test being conducted, the contents of the time and
count unit can be directed to a mass storage device for
off-line analysis or directly to a user for on-line analysis.
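
The role of the time and count unit can be sketched in
software as follows. This is only an illustrative model, not
a description of any actual hardware: the function name and
the representation of the filtered signal as a list of
(timestamp, level) pairs are assumptions.

```python
def time_and_count(samples):
    """Summarize a sampled binary signal.

    samples: list of (timestamp, level) pairs in time order, level 0 or 1.
    Returns (occurrences, active_time): the number of rising edges seen
    and the total time during which the signal was at level 1.
    """
    occurrences = 0
    active_time = 0.0
    prev_time, prev_level = None, 0
    for timestamp, level in samples:
        if prev_level == 1:
            # The signal was active for the whole preceding interval.
            active_time += timestamp - prev_time
        if level == 1 and prev_level == 0:
            # A 0 -> 1 transition marks one occurrence of the event.
            occurrences += 1
        prev_time, prev_level = timestamp, level
    return occurrences, active_time


samples = [(0.0, 0), (1.0, 1), (3.0, 0), (4.0, 1), (4.5, 0)]
result = time_and_count(samples)
```

With these samples the signal becomes active twice, for 2.0
and 0.5 time units, so the unit would record a count of 2
and a duration of 2.5.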

The main advantage of a hardware monitor is its
ability to sense a wide range of hardware and software
events. In addition to cost, the main disadvantage of a
hardware monitoring device is its limited ability to detect
the stimulus for the set of signals it is monitoring.

2. Software Methodology

Although various definitions exist, a software
monitor can be viewed as a process which resides in the
component being monitored. Two types of software monitors
exist which are appropriate for the task of monitoring a
computer network. They are the interrupt-intercept
methodology and the sampling methodology [Ref. 7: p. 56].

The interrupt-intercept methodology embraces the
idea of carrying out some type of monitoring activity every
time the state of the particular resource in which the
monitor is resident changes. The monitoring routine is
invoked whenever an interrupt is generated. The scheme
calls for intercepting each interrupt as it occurs,
directing it to a monitoring routine where the interrupt is
analyzed and appropriate monitoring functions activated, and
finally, passing the interrupt to its intended destination.
This monitoring methodology has the distinct advantage of
allowing measurements to be taken as an integral part of the
system rather than as a lower level application program.
Substantial amounts of processing time and memory utilization
are required for this method. Additionally, it
requires that the software monitoring program run at a very
high priority to prevent other interrupts from deactivating
the monitor [Ref. 7: p. 57].
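
The intercept-then-forward pattern described above can be
sketched as follows. This is a simplified user-level model
under stated assumptions: real interrupt interception takes
place inside the operating system, and the class name,
interrupt kinds, and handlers shown are hypothetical.

```python
import time


class InterceptingMonitor:
    """Records every interrupt, then forwards it to its intended handler."""

    def __init__(self):
        self.handlers = {}   # interrupt kind -> intended destination
        self.counts = {}     # interrupt kind -> number of occurrences
        self.log = []        # (timestamp, kind) for every interrupt seen

    def register(self, kind, handler):
        self.handlers[kind] = handler

    def interrupt(self, kind, payload=None):
        # Monitoring step: runs on every interrupt, before the real handler.
        self.counts[kind] = self.counts.get(kind, 0) + 1
        self.log.append((time.monotonic(), kind))
        # Forwarding step: pass the interrupt to its intended destination.
        return self.handlers[kind](payload)


monitor = InterceptingMonitor()
monitor.register("packet_received", lambda payload: len(payload))
monitor.register("timer", lambda payload: "tick")
results = [monitor.interrupt("packet_received", b"abcd"),
           monitor.interrupt("timer"),
           monitor.interrupt("packet_received", b"xy")]
```

Because the bookkeeping runs on every interrupt, its cost is
paid on every event, which is the source of the processing
and memory overhead noted above.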

The sampling methodology treats the software
monitoring program as a normal user program for a
multiprogramming system. The activation of the monitoring
program may be accomplished by the component resident
operating system, by another monitoring application program, or
by network operations personnel. This activation may occur
at random intervals, scheduled intervals, or a combination
thereof. The selection of inter-sample periods is critical
in that it must not be synchronized with the occurrence of
events which are being measured by the monitor [Ref. 7: p.
57]. As with the interrupt-intercept methodology, a
significant amount of processor time and memory space may be
required.
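
One common way to keep sampling from synchronizing with
periodic events is to randomize each inter-sample period
around a scheduled mean. The sketch below shows that idea
only; the function name, the uniform distribution, and the
parameter values are assumptions rather than anything
prescribed by [Ref. 7].

```python
import random


def sample_times(start, end, mean_interval, jitter, rng):
    """Generate sampling instants with randomized inter-sample periods."""
    if not 0 <= jitter < 1:
        raise ValueError("jitter must be in [0, 1)")
    times = []
    t = start
    while True:
        # Each gap is drawn uniformly from mean_interval * (1 +/- jitter),
        # combining a scheduled interval with a random component.
        t += mean_interval * (1 + rng.uniform(-jitter, jitter))
        if t >= end:
            return times
        times.append(t)


rng = random.Random(7)
times = sample_times(0.0, 60.0, mean_interval=5.0, jitter=0.5, rng=rng)
gaps = [b - a for a, b in zip([0.0] + times, times)]
```

Setting `jitter` to zero reduces this to purely scheduled
sampling, which is the case most at risk of locking onto a
periodic event.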

The principal advantage of the software monitors
presented above is their ability to associate occurrences of
measured events with their causes. The primary disadvantage
is their requirement for substantial resource utilization.
The strengths of the hardware and software monitoring
methodologies have been combined and their weaknesses eliminated
through the use of a hybrid approach.
3. Hybrid Methodology

In contrast to the hardware monitoring methodology,
the hybrid approach to monitoring does not view the hardware
monitoring device as being invisible to the network
component. The hybrid methodology utilizes a microcomputer to
control the functions of the hardware monitoring device in
response to data gathered by hardware probes. Figure 2.2
represents the logical view of a hybrid monitoring device.
The data channel provides a means by which the software
monitor resident in the device being monitored can
communicate with the hardware monitoring device. Along this
channel can pass interrupts and messages concerning the
occurrence of software events within the component. These can
then be associated with signals sensed by the probes of the
hardware monitoring device. This overcomes the strict
hardware monitoring methodology's inability to associate a
signal with a specific event occurrence within the network
component. Additionally, the problem of component resource
utilization associated with the strict software monitoring
methodology is overcome by the transition of various
monitoring functions from the network component to the hardware
monitoring device.

Technologies for the location of monitoring
capabilities within a computer network implicitly utilize one of
the methodologies, or a variation thereof, discussed above.
A number of these technologies will be discussed in the
following section.

B. NETWORK MONITORING TECHNOLOGIES

Figure 2.2 Hybrid Monitoring Device: Logical View.

There are certain considerations that should be
addressed when selecting a monitoring technology.
Initially, a decision has to be made on whether to
record every occurrence of a certain event,
sometimes called trace monitoring [Ref. 7: p. 53], or to
collect samples from the network at selected intervals of
time. Timing considerations for the NBSNET measurement
system indicate complete measurement is possible [Ref. 8: p.
726]. Its architecture, being similar to that of the
SPLICE LAN, would seem to indicate that complete measurement
would be possible for the SPLICE LAN. Whether this would be
desirable or practical are questions that remain to be
addressed. The technique selected must be able to monitor
both hardware and software components individually and any
combination thereof. The level of monitoring to be
conducted must be determined. Does the technology under
consideration provide the capability of both a macroscopic
and microscopic level of monitoring? Is the monitoring
technique capable of supporting a real-time analysis
requirement? To what degree does the monitoring technique
introduce artifact into the system? Other items to be
considered include clock resolution and clock
synchronization.

1.  Sidestream  Monitor 

The sidestream monitoring technology [Ref. 5: p.
92] requires that probes be attached to the side of network
components. By attaching these probes to the 'side' of
network components, we mean physically placing them such
that they may sample and analyze data from physical
interfaces within the component, and at the interface between the
component and network bus. These probes extract and analyze
data from physical interfaces established with these
elements. Additionally, the sidestream technique obtains
information about the network interface and the subnetwork
through the use of a measurement module resident in the
adaptor. Information gathered by these probes and modules
may be sent to a network monitoring center, or to a set of
management programs via a secondary channel which is
frequency-division multiplexed onto the same circuit being
used by the primary data channel.

A major advantage of the sidestream technique is
its ability to alert network operations personnel of
certain types of problems without interfering with normal
data traffic. Certain tests may also be undertaken which
utilize this secondary channel. In this way, isolation
testing may take place without disrupting the primary data
channel. Even though a secondary channel assists in
isolating and correcting certain problems, there still
remain certain tests that must utilize the primary data
channel for their accomplishment. This has been found to be
one of the major problems of the sidestream technique due to
the fact that, during the conduct of these tests, the
network is unavailable.

The sidestream monitoring technology presented here
is a subset of a more encompassing network management
philosophy. Our discussion has briefly touched on the topic
of component failure identification. This was determined to
be necessary in order to more clearly define and explain the
advantages and disadvantages of this technology. This
subject will be addressed again when a discussion of various
techniques for identifying, isolating, and correcting
failing network components is undertaken in Chapter 4.

2.  Mainstream  Monitoring 

The mainstream monitoring technique operates through
the use of hardware and software implemented among existing
network components. These additions provide data to a
network monitoring center or a set of network management
programs. Notification of problems existing within the
network is accomplished through the generation of
asynchronous problem messages. These messages are communicated as
normal data traffic on the primary data channel. Data
provided by these asynchronous problem messages is usually
sufficient to isolate a problem to a particular component
without further problem isolation tests such as those
required by the sidestream method. Error records within a
problem message contain specific information concerning the
problem being reported. Information contained in the error
records is generated by testing modules resident in the
network components, which are invoked upon problem
recognition. If information contained within the problem record is
insufficient to isolate the cause of a specific problem,
additional isolation testing is initiated.
The major advantage of the mainstream monitoring
technique is its ability to isolate and diagnose a problem
based upon information contained in the problem message. A
problem with this technique revolves around the requirement
for these problem messages to utilize the primary data
channel for transmission to the central monitoring site. If
an adaptor through which the message must pass is down, or
if the subnetwork is congested, the problem message may
experience some delay before being communicated to the central
monitoring site.

Like the sidestream technique, the mainstream
technology presented here is a subset of a more encompassing
network management philosophy. Discussion of problem
identification and isolation was included for clarification
purposes. Additional discussion on the subject of component
failure identification, isolation, and correction will be
undertaken in Chapter 4.

3. Centralized Monitoring

A broadcast network lends itself naturally to a
centralized measurement approach [Ref. 8: p. 725].
Centralized monitoring requires modification of the adaptor
connecting the processor which houses the network management
function to the bus. Through this modification, the adaptor
can monitor all packets on the network. Some of the
information which can be extracted and determined from monitoring
packets transiting the network includes: packet size, number
of packets of each type transmitted, and interarrival time
since last packet. Since the modified adaptor simply makes
a copy of the passing packet, extracts the required
information from it, and discards the copy, no artifact is being
introduced into the system.
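
The statistics named above can be derived from the copied
packets with very little processing. The following sketch
assumes, purely for illustration, that the modified adaptor
hands the monitor a list of (arrival time, packet type,
size) tuples; the tuple layout and the type names are
hypothetical.

```python
from collections import Counter


def summarize(observed):
    """Summarize packets seen by a central monitor.

    observed: list of (arrival_time, packet_type, size_bytes) tuples
    in arrival order, as recorded at the monitoring adaptor.
    """
    by_type = Counter(ptype for _, ptype, _ in observed)
    sizes = [size for _, _, size in observed]
    arrivals = [t for t, _, _ in observed]
    # Interarrival time: gap between each packet and the one before it.
    interarrival = [b - a for a, b in zip(arrivals, arrivals[1:])]
    return {
        "packets_by_type": dict(by_type),
        "mean_size": sum(sizes) / len(sizes) if sizes else 0.0,
        "interarrival": interarrival,
    }


observed = [(0.0, "data", 512), (2.0, "ack", 64), (5.0, "data", 256)]
stats = summarize(observed)
```

Note that every time in this sketch is the time of
observation at the monitor, not the time of transmission at
the sending node.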

Certain important information cannot be obtained
utilizing the centralized monitoring technique. The time
between arrival of a packet at the network interface and
its subsequent transmission onto the network is only
available at the interface. Thus we have no measurement of
the effectiveness of our access protocol. Although a
collision on the network can be detected by the central
monitor, the monitor is not capable of determining which
nodes' packets were involved in the collision.

Additionally, the timing information recorded by the
central monitor is biased. This is caused by the
propagation delay between the sending adaptor and the
monitoring adaptor. Figure 2.3 depicts a local network with
centralized monitoring.


Figure  2.3  Centralized  Monitoring. 

4. Decentralized Monitoring

Figure 2.4 represents a decentralized monitoring
scheme. In using this approach, the burden of network
monitoring is placed on each individual interface. The
functions of the central monitoring site no longer include
monitoring. The tasks performed by the central monitoring
site are now restricted to data collection from the [...]
protocols call for the transmission of measured information
after a certain amount of time has elapsed, or after a
certain number of events have occurred.

With a decentralized approach all information about
the network traffic is available [Ref. 8: p. 725].
Information about collision induced delays and collision
counts can be obtained from each adaptor. Exact times for
packet transmission and receipt are available. Another
positive attribute of decentralized monitoring is the
ability to identify those nodes whose packets were involved
in a collision. To provide this enhanced service,
additional memory and real time clocks must be incorporated
into each network interface. Additionally, the periodic
transmission of data to the central monitoring site requires
overhead communication. If sent over dedicated lines, as
depicted in Figure 2.4, extra costs are incurred. If these
information packets are sent over the primary data channel,
artifact is introduced into the system. Finally, since this
technique requires that all adaptors in the network possess
a greater than normal degree of intelligence, implementation
and maintenance tend to be more costly than with centralized
monitoring.

5.  Hybrid  Monitoring 

The hybrid monitoring technique grew out of the
advantages and disadvantages of the centralized and
decentralized technologies. In this approach, as much
information as possible is collected by the central
monitoring site. Only those measurements unobtainable by the
central monitor are measured by each network interface.
This allows for minimal modification to the network
interface. Figure 2.5 represents the hybrid monitoring
technique.




The transmission of data to the central monitoring site is
initiated upon the termination of a logical connection.
Implementation of this protocol reduces the introduction of
artifact into the system.


Figure 2.5 Hybrid Monitoring.

In combining the advantages and eliminating the
disadvantages of centralized and decentralized monitoring,
the hybrid monitoring technology provides the network
with an accurate and comprehensive measurement and
monitoring capability. One disadvantage is the
complexity of coordinating the analysis of decentralized and
centralized measurements [Ref. 8: p. 725].

C. CHAPTER SUMMARY


This Chapter began with an explanation of hardware,
software, and hybrid monitoring methodologies. Strengths
and  weaknesses  of  each  were  discussed.  Those  attributes, 
both  positive  and  negative,  associated  with  software  and 
hardware  monitoring  were  found  to  be  the  criteria  upon  which 
the  development  of  the  hybrid  approach  was  based. 

In the second section of this chapter, implementations
of the basic methodologies were presented. The monitoring
technologies addressed were: sidestream, mainstream,
centralized, decentralized, and hybrid monitoring. The
discussion of each technology included: a brief explanation
of the operation of the monitoring technique, presentation
of advantages and disadvantages, and in some cases,
comparison to other monitoring technologies.

Each one of the monitoring technologies presented is
capable of providing adequate monitoring and measurement
capabilities for use by the SPLICE LAN management function.
It is proposed that the hybrid technique be adopted as the
monitoring technology utilized by the SPLICE LAN. This
technique emphasizes the concept of minimizing data
collection at network interfaces. Only those measurements
unobtainable by the central monitor would be gathered by the
adaptors. As in the mainstream monitoring technology, each
adaptor would be capable of detecting problems and invoking
local test modules which would gather data concerning the
problem for subsequent transmission to the central
monitoring site. Data collected by the adaptors would be
sent to the central monitoring site as administrative
packets over the primary data channel. In addition to the
transmission of routine measurement information upon the
termination of each logical connection, problem messages,
similar to those implemented by the mainstream technology,
will be transmitted asynchronously to the central monitoring
site.

To this point, the general architecture of a SPLICE LAN,
along with a proposed monitoring methodology, has been
presented. Now that a method exists which allows us to
obtain information from the network, the focus of this
thesis changes to address the question: What measurements
and statistics should we be able to derive from the network
in support of experimental and operational functioning? No
attempt will be made to itemize the measurements and
statistics required for the accomplishment of each specific
experimental or operational endeavor. Rather, a discussion
of basic measurement tools will be undertaken, followed by
the identification and explanation of measures and
statistics appropriate for use in managing local area
networks which must interface with the DDN and where control
of the dominant DDN does not come under the authority of the
LAN managers.

A. NETWORK MEASUREMENT TOOLS

In  order  to  evaluate  the  performance  of  a  network,  and 
to  identify  down  or  failing  components,  several  measurement 
tools  must  be  available.  These  tools  are:  cumulative 
statistics,  trace  statistics,  snapshot  statistics,  artifi¬ 
cial  traffic  generators,  emulation,  a  network  measurement 
center  which  includes  control,  collection  and  analysis  of 
data, and a network control center which accomplishes status
reporting, monitoring, and control of the network. These
latter  two  tools  may  be  combined  into  a  single  entity  which 
is  sometimes  called  a  monitoring  center.  Each  of  these 
tools  will  be  addressed  in  the  following  section. 

1. Cumulative Statistics

Cumulative statistics consist of data regarding a
variety of events, accumulated over a given period of time.
These are provided in the form of sums, frequencies, and
histograms [Ref. 6: p. 1439]. This tool is one that should
be included in the capabilities of the SPLICE local area
computer network measurement facility. Since some
cumulative statistics can become quite long, it is wise to
control their transmission to the central site in some way.
One approach might be to designate certain items within a
cumulative statistical message as being optional. This
provides network operations personnel with many measurement
capabilities, yet precludes the formulation and transmission
of excessively long cumulative statistical messages.

2. Trace Statistics

Trace statistics allow network operations personnel
to literally follow a packet through the network and to
learn of the route that it takes and the delays it
encounters [Ref. 10: p. 633]. Obviously, in a bus oriented
network, there is no requirement to identify the
route a packet has taken to its destination. Although more
applicable to a packet switched store and forward network,
certain aspects of a trace mechanism may prove useful in a
local area network. One such aspect might be
timestamping the packet as it arrives at the adaptor from a
processor, and subsequently recording the time the packet
is successfully transmitted. Additionally, the packet
could be timestamped when it arrives at the destination
adaptor, and the time at which the packet is forwarded to
the resident processor could subsequently be recorded.
These statistics can then be forwarded to a central
monitoring site upon demand or at some predetermined time.
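
As an illustration of what such a trace might yield, the sketch below turns the four timestamps into three delay components. The record layout and field names are hypothetical, times are taken to be integer microseconds, and the channel transit figure is only meaningful under the additional assumption that the source and destination adaptor clocks are synchronized.

```python
from dataclasses import dataclass


@dataclass
class TraceRecord:
    """Four timestamps (microseconds) collected for one traced packet (hypothetical layout)."""
    arrived_at_source_adaptor: int
    transmitted: int
    arrived_at_dest_adaptor: int
    forwarded_to_processor: int

    def delays(self) -> dict:
        # Each component is the difference between two adjacent timestamps.
        return {
            "source_adaptor_wait": self.transmitted - self.arrived_at_source_adaptor,
            "channel_transit": self.arrived_at_dest_adaptor - self.transmitted,
            "dest_adaptor_wait": self.forwarded_to_processor - self.arrived_at_dest_adaptor,
        }
```

The two adaptor waits depend only on one local clock each, so they remain usable even when clocks across the network disagree.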


3. Snapshot Statistics

Snapshot statistics provide an instantaneous look at
a device, showing its state with regard to various queue
lengths and buffer allocation [Ref. 6: p. 1440]. In a high
speed, dynamic environment such as a local area network,
these types of statistics can prove valuable in the
evaluation of certain protocols. Evaluation of a network
access protocol could be conducted by observing the length
of the 'packets ready for transmission' queue. Additional
information that could be contained in a snapshot of a
particular network component or set of components includes
processor queue lengths, storage allocation, and status of
adaptor buffers for receipt and transmission of packets.

4. Artificial Traffic Generators

The use of artificial traffic generators provides
network operators with the ability to create streams of
packets with specified durations, inter-packet gaps, packet
lengths and other appropriate characteristics [Ref. 6: p.
1440]. This tool plays a major role during the
implementation of the LAN. In the absence of sufficient
traffic to test certain aspects of the network, artificial
traffic generators provide the mechanism through which
varying network load conditions can be simulated. This
provides a more realistic environment in which testing may
be conducted, and provides a mechanism that can be used to
identify network problem areas. By doing this, we are able
to effect modifications to the LAN while it is in its
infancy rather than attempting to make changes when
production activities are heavy and corrections more
expensive. Additionally, artificial traffic generators may
be used to test and analyze various network protocols. This
is accomplished through the generation of identical
transmission strings, thereby providing a basis upon which
the performance of the various protocols can be compared.

Amer [Ref. 8: p. 726] has identified five
capabilities that should be possessed by an artificial
traffic generator. These include the ability to: 1) generate
packets with a constant, uniform, or Poisson size
distribution, 2) generate packets with constant, uniform, or
exponential interarrival times, 3) direct packets to any
specified destination, 4) communicate with the monitoring
system to synchronize traffic generation and data
collection, and 5) permit on-line control by operations
personnel.
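
Capabilities 2) and 3) can be illustrated with a short sketch that builds a send schedule for one artificial packet stream. The function names, the seed parameter, and the choice of a uniform range of zero to twice the mean are assumptions made for illustration; they are not taken from the NBSNET generator. A fixed seed also reproduces the same stream on every run, which is what allows identical traffic to be replayed against different protocols.

```python
import random


def interarrival_gaps(kind, n, rng, mean=1.0):
    """Return n gaps (seconds) drawn from the named distribution with the given mean."""
    if kind == "constant":
        return [mean] * n
    if kind == "uniform":
        # Uniform on [0, 2 * mean] so that the average gap equals `mean`.
        return [rng.uniform(0.0, 2.0 * mean) for _ in range(n)]
    if kind == "exponential":
        return [rng.expovariate(1.0 / mean) for _ in range(n)]
    raise ValueError(f"unknown distribution: {kind}")


def schedule(kind, n, destination, seed=0, mean=1.0):
    """Build (send_time, destination) pairs for one artificial packet stream."""
    rng = random.Random(seed)  # private generator, so a seed fixes the whole stream
    t, plan = 0.0, []
    for gap in interarrival_gaps(kind, n, rng, mean):
        t += gap
        plan.append((t, destination))
    return plan
```

Packet sizes could be drawn in the same manner from a second distribution selected by the operator.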

5.  Emulation 

Emulation is the creation of an illusion that there
exist more components of a certain kind in the network than
actually exist. Each one of these "fake" components is
capable of displaying the characteristics of, and performing
the functions of, a "real" physical component of that type.
Emulation is required when there are not enough network
components to provide sufficient traffic generation and a
range of nodal characteristics. In supplementing these
areas, emulation gives the operator a better understanding
of network behavior under various configurations. Closely
related to this is the use of emulation in conjunction with
capacity planning. Through emulation, we are able to
determine what effect a change to the network configuration
will have on various performance measures. A situation in
which emulation might be employed would be to determine the
effect of adding or deleting a host processor from the
SPLICE local area computer network.

6. Network Measurement/Control Centers

In the early, experimental days of the ARPA Computer
Network, there existed physically separate measurement and
control centers. This allowed for continual experimentation
with the network from the measurement center, while actual
control of the network was conducted from the control
center. These two functions of measuring and controlling
have been combined, and will be undertaken by a single
monitoring center for the Defense Data Network [Ref. 11: pp.
95-102]. The responsibilities of a Network
Measurement/Control Center include: controlling the
measurement facilities, collecting and analyzing data,
generating status reports, and monitoring and controlling
the network. These responsibilities, as they apply to a
local computer network, will be addressed in much greater
detail in Chapter 5.

B.  MEASUREMENTS  AND  STATISTICS 

Now that the measurement tools have been identified and
discussed, the question arises, 'How do we select and
implement the appropriate tools for the measurement task at
hand?'. Before this subject can be addressed, the answers
to two questions posed in Chapter 2 must be determined.
Those questions were: Why should the measurement be taken?
(i.e. What managerial and research questions are to be
answered by the measurement?), and, What is to be measured?
(i.e. What specific network characteristics must be measured
in order to satisfy these questions?).

An approach to answering these two questions is as
follows. Initially, it is appropriate that the object of
the measurement operation be defined. This entails the
identification of some specific area of the network to be
investigated. In conjunction with this, the goals of the
measurement operation are solidified. The next step is to
select those performance measures that best characterize the
area of the network being studied. Finally, the specific
measurement tools most appropriately suited for the
measurement operation are identified and selected for
implementation.

Goals of measurement operations are usually motivated by
a desire for software verification, for performance
evaluation and verification, to obtain feedback for system
design iterations, to identify down or failing components,
and to study user behavior and characteristics. Performance
measures can be categorized as basic, special, or composite.
Examples of basic measures include throughput and delay.
When examination of a specific procedure is required,
specialized measures must be used to complement the basic
measures. They are aimed at measuring a specific attribute
of a specific network component. Finally, in order to
analyze some global system properties which cannot be easily
described by throughput and delay, it becomes necessary to
aggregate a set of measurements that have been taken over a
specific monitoring period. This aggregation of measures is
called composite measurement. Examples of composite
measures include fairness, congestion protection, stability,
robustness of network algorithms to line errors, and
reliability of a network configuration with respect to
component failures [Ref. 6: p. 1443].

Returning to the subject of measurement tool selection
and implementation, we find that a subset of these tools
has been utilized in obtaining specific data from an
operational local area computer network [Ref. 8]. In
describing certain reports generated from data collected
throughout this network, the researcher will attempt to show
how the tools are selected and integrated in order to
provide network operations personnel with accurate, timely
and sufficient information upon which the management of the
network may be based.

It would be infeasible to identify and provide rationale
for every measurement that could be taken from a local area
network. Additionally, this list would be a dynamic one,
dependent upon the goals and objectives of the specific
network analysis operation to be undertaken. For these
reasons, an established measurement capability for a local
area computer network will be investigated in an attempt to
bring forth and discuss the implementation of various
measurement tools as they pertain to the SPLICE LAN. Ten
performance reports have been implemented for measuring
NBSNET traffic [Ref. 8]. Each report is classified as
either a traffic characterization or a performance analysis
type. Traffic characterization reports indicate the
workload placed on the system. Performance analysis reports
indicate the time delays, utilizations, etc., which result
from a given load and network configuration. They describe
the dependent variables that are observed rather than
controlled, and are used for tuning the network and making
performance comparisons [Ref. 8: p. 726]. At this time, a
brief description of each report will be given, along with
appropriate comments relating to requirements of, and
recommendations for, the SPLICE LAN.

1. Host Communication Matrix

The Host Communication Matrix indicates the traffic
flow between connected nodes. For each node, data tabulated
includes: the total number of packets, data packets and data
bytes received from and sent to all other nodes. From this,
the proportion of data packets to total packets, and data
bytes to total bytes are determined. Utilizing the
monitoring technique proposed for the SPLICE LAN in Chapter
2, this information could be obtained by the central
monitoring center through its tap into the channel. In this
way the number of bytes in a packet can be counted by the
monitoring center, and the source, destination, and packet
type determined from the header. Additionally, a summary of
the total network traffic is made available which includes
total packets, data packets, and data bytes transmitted, and
the mean number of data bytes per packet.
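
A minimal sketch of how the central monitoring center might accumulate such a matrix is given below. The tuple layout of an observed packet, the packet type label "data", and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def host_matrix(packets):
    """Accumulate one cell per (source, destination) pair.

    packets: iterable of (source, destination, packet_type, total_bytes, data_bytes),
    a hypothetical record built from each packet header and a byte count.
    """
    cells = defaultdict(lambda: {"packets": 0, "data_packets": 0,
                                 "data_bytes": 0, "total_bytes": 0})
    for src, dst, ptype, total_bytes, data_bytes in packets:
        cell = cells[(src, dst)]
        cell["packets"] += 1
        cell["total_bytes"] += total_bytes
        if ptype == "data":
            cell["data_packets"] += 1
            cell["data_bytes"] += data_bytes
    return dict(cells)


def proportions(cell):
    """Return (data packets / total packets, data bytes / total bytes) for one cell."""
    return (cell["data_packets"] / cell["packets"],
            cell["data_bytes"] / cell["total_bytes"])
```

Summing the cells in a row gives the traffic sent by a node, and summing a column gives the traffic it received.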

2. Group Communication Matrix

The Group Communication Matrix indicates the traffic
flow between any user defined groupings of nodes. The same
type of information tabulated by the Host Communication
Matrix is recorded by the Group Communication Matrix. The
possible extension of this concept to include the recording
of the traffic flow between various user designated
processes may prove more valuable than the information
originally seen as the product of this report. An example
of this would be the recognition that a number of processes
utilize the same data file on a regular and possibly
concurrent basis. Assuming this data is currently kept on a
tape storage device (which is not out of the question in a
government installation), and possessing the information
provided by the Group/Process Communication Matrix, serious
consideration should be given to relocating this data to a
faster and more accessible storage device.

3. Packet Type Histogram

The Packet Type Histogram records and summarizes the
distribution of each type of packet transmitted on the
network. A simple example would be the total number of data
packets transiting the network during a specified monitoring
period. Gathering data to be utilized in constructing a
packet type histogram can be easily accomplished by a
central monitoring site. A summary of packet types could
provide network operations personnel with information
concerning the amount of 'overhead' data in relation to the
amount of 'pure' data being transmitted. Additionally, it
may be found that there exist certain times when the
network is carrying a disproportionate amount of
overhead data as a result of component failure, excessive
measurement, or excessive monitoring requirements.

4. Data Packet Size Histogram

This histogram records the number and proportion of
data packets that fall into a class of specified length.
These classes can be either preset or operator defined. For
packets of fixed size, the data portion alone may be counted
and utilized as the criterion for class inclusion. Variable
size packets allow for a strict count of bytes making up the
entire packet. A Data Packet Size Histogram can
be extremely useful in a network utilizing packets of a
fixed length. If the mean length of data carried
in any one packet is substantially below the carrying
capacity of the fixed data field, consideration should be
given to reducing the size of the fixed data field. This
will reduce the amount of 'excess baggage' being carried by
packets throughout the network. Likewise, if packet data
fields are full, or nearly so, a good portion of the time,
consideration should be given to increasing the size of the
data field.
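
The class counting and the comparison of mean data length against the fixed field size can be sketched as follows. The class boundaries, the overflow class, and the function names are assumptions made for this example rather than details of the NBSNET report.

```python
from bisect import bisect_left


def size_histogram(data_lengths, upper_bounds):
    """Count data lengths per class.

    Class i holds lengths that are <= upper_bounds[i] and greater than the
    previous bound; one extra overflow class collects anything larger.
    upper_bounds must be sorted in ascending order.
    """
    counts = [0] * (len(upper_bounds) + 1)
    for n in data_lengths:
        counts[bisect_left(upper_bounds, n)] += 1
    total = len(data_lengths)
    shares = [c / total for c in counts] if total else [0.0] * len(counts)
    return counts, shares


def mean_fill(data_lengths, field_size):
    """Average fraction of the fixed data field that actually carries data."""
    return sum(data_lengths) / (len(data_lengths) * field_size)
```

A mean fill well below one suggests the fixed data field is oversized, while a large share in the top class suggests it is too small.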

5. Throughput-Utilization Distribution

The Throughput-Utilization Distribution indicates
the flow of bytes on the network. Both information (data)
bytes and total bytes are measured. Information bytes do
not include header bytes or unacknowledged data packets.
Additionally, bytes involved in collisions are not counted.
Using this approach, total channel throughput, channel
utilization, information throughput and information
utilization can be determined for the network. In this way, a
true picture of the beneficial usage of the network can be
obtained. Collecting the measurements required for the
creation of this report is a simple task which can be
performed by the central monitoring site.
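
The four quantities named above follow directly from two byte counts, the length of the monitoring period, and the channel capacity. The sketch below assumes throughput is expressed in bits per second and utilization as a fraction of the channel capacity; the function and key names are hypothetical.

```python
def throughput_report(total_bytes, info_bytes, period_s, channel_bps):
    """Derive throughput (bits/s) and utilization (fraction of capacity).

    total_bytes: all bytes successfully carried during the period
    info_bytes:  acknowledged data bytes only (headers excluded)
    period_s:    length of the monitoring period in seconds
    channel_bps: channel capacity in bits per second
    """
    total_bps = 8 * total_bytes / period_s
    info_bps = 8 * info_bytes / period_s
    return {
        "channel_throughput_bps": total_bps,
        "channel_utilization": total_bps / channel_bps,
        "information_throughput_bps": info_bps,
        "information_utilization": info_bps / channel_bps,
    }
```

For example, 1,250,000 total bytes of which 1,000,000 are information bytes, observed over 10 seconds on a 10 Mbit/s channel, correspond to a channel utilization of 0.10 and an information utilization of 0.08.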

6. Packet Interarrival Time Histogram

The Packet Interarrival Time Histogram indicates the
number of packet interarrival times which fall into
particular time classes. An interarrival time is the time
between consecutive carrier (network busy) signals. This
measurement can assist in determining how much the network
is being used and what percentage of the time the network is
idle during a specified monitoring period. If a large
percentage of interarrival times fall into a class which
records occurrences of large interarrival times, then it is
safe to conclude that, during the monitoring period in
question, the network was not highly utilized. When taking
these measurements from a central monitoring site,
consideration should be given to the fact that the recorded
interarrival times will be slightly biased due to the
propagation delays between the adaptors and the monitoring
site. In the high speed environment of a local area
network, these delays are seen as being negligible.
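
The size of this bias can be illustrated with a small calculation. The sketch assumes a single linear bus, positions measured in meters along the cable, and a signal propagation speed of 2.0e8 m/s (roughly two thirds of the speed of light, a commonly quoted figure for coaxial cable). All of these are assumptions for illustration and not measured properties of the SPLICE LAN.

```python
def observed_interarrival(t1, pos1, t2, pos2, monitor_pos, speed_m_per_s=2.0e8):
    """Interarrival time seen at the monitor for transmissions starting at t1 and t2.

    pos1, pos2, monitor_pos are positions in meters along the bus.
    """
    arrive1 = t1 + abs(pos1 - monitor_pos) / speed_m_per_s
    arrive2 = t2 + abs(pos2 - monitor_pos) / speed_m_per_s
    return arrive2 - arrive1


def bias(t1, pos1, t2, pos2, monitor_pos, speed_m_per_s=2.0e8):
    """Difference between the observed and the true interarrival time."""
    observed = observed_interarrival(t1, pos1, t2, pos2, monitor_pos, speed_m_per_s)
    return observed - (t2 - t1)
```

Under these assumptions, two adaptors 1000 meters apart with the monitor located at the first one give a bias of 5 microseconds, and no pair of adaptors can produce a bias larger than the end-to-end propagation delay of the cable.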

7. Channel Acquisition Delay Histogram

The Channel Acquisition Delay Histogram depicts the
time spent by adaptors contending for and acquiring the
channel. The channel acquisition delay begins when an
adaptor becomes ready to transmit a packet and ends when the
first bit is transmitted into the channel. Included is all
of the time spent deferring due to a busy channel and the
time recovering and backing off from one or more collisions.
From these measurements, we can identify, for each interface,
the number of packets whose deferral times fell into various
time classes. When using a CSMA/CD access protocol, it
would be appropriate to assume that, under similar
conditions, the distributions of channel acquisition delay
times for all adaptors should appear very much the same. If
the distribution for one adaptor varies noticeably from the
others, this is a good indication that some type of problem
exists within that adaptor. Additionally, the mean channel
acquisition delay time and its associated standard deviation
can be determined from data contained in the histogram. The
collection of this data must be accomplished by each
individual adaptor. Results of the measurements taken at
each interface must then be forwarded to the central
monitoring site on demand, or at some prearranged time.
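
One way the mean and standard deviation might be recovered from histogram data is to treat every observation as if it fell on the midpoint of its class. This is an approximation whose accuracy depends on the class widths, and the representation of a class as a (lower, upper, count) triple is an assumption made for this sketch.

```python
import math


def histogram_mean_std(classes):
    """Approximate mean and standard deviation from binned delay data.

    classes: list of (lower, upper, count) triples. Every observation is
    treated as lying on its class midpoint, so the result is an estimate.
    """
    n = sum(count for _, _, count in classes)
    if n == 0:
        raise ValueError("empty histogram")
    mean = sum((lo + hi) / 2 * count for lo, hi, count in classes) / n
    variance = sum(count * ((lo + hi) / 2 - mean) ** 2
                   for lo, hi, count in classes) / n
    return mean, math.sqrt(variance)
```

Computing these two figures for each adaptor gives the central monitoring site a compact basis for comparing adaptors that should behave alike.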

8.  Communication  Delay  Histogram 

The Communication Delay Histogram indicates the
delays that adaptors incur in communicating packets to their
destination. Theoretically, a communication delay begins
when an original packet becomes ready for transmission and
ends when that packet is received by the destination. By
definition, a communication delay excludes the time to
generate and communicate an acknowledgment packet back to
the original sender. As implemented by the NBSNET,
communication delay is measured from the time at which a
packet is ready for transmission until the last bit of the
packet is transmitted onto the channel. This value is saved
until the transmission is acknowledged, at which time a
local histogram is updated. From this it can be seen that
measurements must be taken by the adaptor and sent to the
central monitoring site upon demand or at a predetermined
time. With this approach, the communication delay time
recorded will not include the time to propagate the signal
to the destination. This is taken into consideration when
measuring 'one hop' delay. Although similar to
communication delay, 'one hop' delay includes propagation
delay time, and the time for the destination to communicate
its acknowledgment back to the source. The delay measured,
communication or 'one hop', depends upon the goals and
objectives of the measurement operation.

9. Collision Count Histogram

This histogram tabulates the number of collisions a
packet of any type encounters before being transmitted.
Interpretation of these statistics provides an indication of
the efficiency of a CSMA/CD protocol in allowing interfaces
to acquire the channel. Recording of collision information
for each specific packet must be accomplished at the local
level. Every time a packet is involved in a collision, a
counter in the packet header is incremented by one. Upon
successful transmission, the number of collisions incurred
by the packet prior to transmission is read directly from
the packet header by the central monitoring site.
Transferring information in this manner to the central
monitoring site would require a modification to the packet
format proposed in [Ref. 2]. This modification would
require the inclusion of a field for the number of
collisions experienced by the packet prior to successful
transmission. By combining collision count information from
throughout the network, the central monitoring site is able
to determine the mean number of collisions per packet
transmission and its associated standard deviation for the
entire network.

10. Transmission Count Histogram

The Transmission Count Histogram indicates the
number of times a packet is transmitted before it is
communicated to its destination. A packet is communicated when
it is successfully received by the intended destination. A
packet may be transmitted but not communicated due to a
collision, line noise, or erroneous transmission. The
number of times a packet is transmitted before it is
communicated can be detected by the central monitoring site. It
does this by observing packet sequence numbers and is thus
able to recognize the first through the last times a
particular packet is transmitted and which transmission is the
communication. Through the use of this histogram, we are
able to determine the total number of packets transmitted,
the total number of packets successfully communicated, the
mean number of transmissions prior to successful
communication and the associated standard deviation. Under ideal
conditions, the number of transmissions per communication is
one. In a fully operational network this will probably not
be the case, the actual value being dependent upon the load
on the system and the current network configuration.
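
As an illustration, the quantities named above can be derived
from the histogram in a few lines of Python. This is a minimal
sketch, not the NBSNET implementation: the dictionary layout
(transmission count mapped to number of packets) and the
function name are assumptions made for the example, and
packets that were never communicated are not represented.

```python
import math

def transmission_stats(histogram):
    """Summarize a transmission count histogram.

    histogram maps k (transmissions needed before a packet was
    communicated) to the number of packets that needed exactly k.
    This layout is an assumption made for illustration.
    """
    communicated = sum(histogram.values())
    if communicated == 0:
        raise ValueError("histogram contains no communicated packets")
    transmitted = sum(k * n for k, n in histogram.items())
    mean = transmitted / communicated
    # Population variance of transmissions per communicated packet.
    variance = sum(n * (k - mean) ** 2 for k, n in histogram.items()) / communicated
    return {
        "transmitted": transmitted,
        "communicated": communicated,
        "mean": mean,
        "std_dev": math.sqrt(variance),
    }

# Hypothetical data: 90 packets got through on the first try,
# 8 needed two transmissions, and 2 needed three.
stats = transmission_stats({1: 90, 2: 8, 3: 2})
```

A mean close to one indicates that the network is operating
near the ideal condition described above.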

C.  CHAPTER  SUMMARY 

We began this Chapter with an overview of the various
network measurement tools. The format of our overview
called for defining a specific measurement tool, followed by
a discussion of that tool's prominent measurement
characteristics. The tools discussed were: cumulative statistics,
snapshot statistics, trace statistics, emulation, artificial
traffic generators, and a measurement/control center. The
topic of measurement tool selection and implementation was
then presented. An approach was offered as a means through
which measurement tool selection could take place. This
approach requires that the object of the measurement
operation be defined, followed by a statement of the goals of the
measurement operation. Next, the performance measures that
best characterize the area of the network under
investigation are selected. From this, the measurement tools most
appropriate for use in obtaining the required information
from the network are identified and implemented.

Having concluded that it would be infeasible to identify
and provide rationale for every measurement that could be
taken from a local area network, we discussed the
measurements currently being taken on an operational local
area computer network. Ten performance
reports implemented on the NBSNET were explained and their
relevance to the SPLICE LAN discussed. These reports were:
Host Communication Matrix, Group Communication Matrix,
Packet Type Histogram, Data Packet Size Histogram,
Throughput-Utilization Distribution, Packet Interarrival
Time Histogram, Channel Acquisition Delay Histogram,
Communication Delay Histogram, Collision Count Histogram,
and Transmission Count Histogram.

The question, 'How much of the network traffic should
be measured?', was implicitly addressed in our discussion of
artificial traffic generators. Basically, two approaches
exist. The first is to measure everything on the network,
which would make it possible to totally reconstruct the
original traffic. Some problems exist with this approach.
First, a prohibitive amount of storage would be required for
the data collected from the network. Second, the review and
analysis of this information would take an excessive amount of
time. Finally, it may be found that adaptors are spending an
inappropriate amount of time collecting and processing
measurement data.

The second approach to network measurement employs a
sampling technique. Here, performance measurements are
constructed only from a subset of the total packets
transiting the network. Measurements can be taken randomly from
the normal packets flowing on the network, or from
packets created explicitly for measurement purposes by an
artificial traffic generator. In the first case, no control
is exercised over the packets being transmitted through the
network. In the second case, control of the packets is
possible. The characteristics and benefits of artificial
traffic generators have been previously discussed in this
thesis and in [Ref. 12]. Additional justification for the
implementation of artificial traffic generators is provided
by Tobagi in the statement, "Generally, internal subnet
performance is better studied in a controlled traffic
environment rather than in a real traffic environment" [Ref. 6:
p. 1442].
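
The first form of sampling, taking measurements at random
from normal traffic, might be sketched as follows. The
function name, the use of a fixed sampling fraction, and the
stand-in packet records are all assumptions made for
illustration; the thesis does not prescribe a particular
sampling rule.

```python
import random

def sample_packets(packets, fraction, seed=None):
    """Return a random subset of observed packets.

    Each packet is kept independently with probability `fraction`.
    A seed may be given so that a run can be repeated exactly.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    return [p for p in packets if rng.random() < fraction]

observed = list(range(1000))      # stand-in for observed packet records
sampled = sample_packets(observed, 0.1, seed=1)
```

Only the sampled records would then be stored and analyzed,
which addresses the storage and analysis-time problems of
measuring everything at the cost of statistical rather than
exact results.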

To obtain a thorough performance analysis of the SPLICE
LAN, this researcher feels that network operations personnel
must be able to generate known artificial traffic loads on
the system. To implement this capability, it is proposed
that each adaptor be able to function as an artificial
traffic generator. Process activation, deactivation and
parameter establishment would be controlled by the central
monitoring site. Additionally, it is recommended that, for
specified monitoring periods, the network possess the
capability to measure every occurrence of certain types of
events. This capability is required in order to create
various matrices and histograms (e.g., Host Communication
Matrix and Packet Type Histogram).

The  researcher  does  not  feel  there  exists  an  urgent 
requirement  for  an  emulation  capability.  The  composition  of 
the  SPLICE  configuration  has  been  established  and  is 


seem to be in the area of additional host processing
capability. It is the opinion of this researcher that the
controlled addition of processing capability will not tax the
network's ability to satisfactorily deliver packets. This
conclusion is based on a review of a report dealing with the
performance evaluation of the Ethernet local computer
network [Ref. 14], and on the assumption that there exists
enough similarity between the SPLICE LAN and the Ethernet to
justify a conclusion of similar performance under increased
loading conditions.

It is recommended that the ten measurement reports
discussed in this Chapter be adopted as the basis upon which
the measurement capability for the SPLICE LAN is
established. It is the opinion of this researcher that these
reports provide an accurate and fairly comprehensive picture
of network performance which can be utilized by operations
personnel in managing the network. Additional measurement
reports that could augment those already presented would
possess the ability to measure response time, processor and
line utilization, characters and messages received in error
per unit time, average delay, and software queue lengths and
buffer counts such as in adaptors and shared resources.

Uses  for  measurements  taken  from  a  computer  network 
include  performance  analysis,  and  component  failure  identi¬ 
fication,  isolation,  and  testing.  The  degree  of  success 
achieved  in  the  accomplishment  of  these  tasks  is  highly 
dependent  upon  a  comprehensive  and  accurate  measurement 
facility. To ensure this capability continues to be
provided throughout the life of the network, it is
imperative that the measurement software incorporate a flexible
design in order to accommodate expansion of, and modification
to, the measurement tools.

IV. NETWORK PERFORMANCE ANALYSIS AND COMPONENT FAILURE


In Chapter 2 we presented and discussed various
monitoring methodologies. The concept of network measurement
was then taken up in Chapter 3. Using the knowledge
imparted by these Chapters, we can now discuss the topics of
network performance analysis and component failure handling.
Basically, network performance analysis is concerned with
evaluating data and statistical reports obtained by the
network's measurement function. During this evaluation
process, measurements are scrutinized for signs of component
failure and inefficient network functioning. Additionally,
performance analysis of the network allows us to: adjust
network performance parameters in order to 'tune' the
network, plan for network growth, and identify bottlenecks
at various components throughout the system. For our
discussion, the concept of failure has been more broadly
defined to include the network's inability to provide timely
service to its users. What this means is that degradation
of selected performance measures, such as network
throughput, will be classified as a failure within the
network.

Initially, a discussion dealing with the question, 'At
what time should the performance analysis take place?', is
entered into. We then look at the function of performance
analysis  as  it  pertains  to  a  local  area  computer  network. 
Finally,  a  presentation  and  evaluation  of  various  techniques 
used  in  the  detection  and  diagnosis  of  network  component 
failure  is  undertaken. 


A. PERFORMANCE ANALYSIS TIMING


There are three time frames in which performance
analysis can take place. These are on-line, off-line, and
instantaneously. Off-line analysis requires the evaluation
of performance measurements to take place upon completion of
the monitoring period. On-line analysis enables the
evaluation to take place during the monitoring period. Evaluating
data at this time implies a delay between the generation of
the measurements, their analysis, and subsequent actions
taken as a result of this analysis. Instantaneous analysis
is accomplished through the use of dynamic control programs.
These programs provide for the immediate analysis of data
and statistical reports, followed by any corrective action
that may be required.

1. Off-Line Analysis

Off-line analysis implies that the records generated
by the monitoring system are placed in mass storage for
future analysis. Performance analysis is accomplished in
this way by the NBSNET [Ref. 8]. Delay in corrective action
initiation due to off-line analysis experienced by the
NBSNET was 5-10 minutes [Ref. 8: p. 726]. Implementing this
'method' of performance analysis provides the analyst with
the ability to obtain an overall picture of network
performance before making any otherwise rash parameter adjustments.
This method also allows a more in-depth analysis of the
performance measurements through the use of off-line testing
and evaluation programs. An additional reason for the use
of off-line analysis is based upon the speed of the LAN.
The high rate at which packets are transmitted means that
there is only a small amount of time to simultaneously
assimilate the data and create statistical reports upon which to
act. The major problem associated with off-line analysis is
its lack of responsiveness. As a result of the speed of a
LAN, the environment which was recorded during the
monitoring period may not exist upon completion of the analysis.
Therefore, any adjustments to the parameters based upon the
analysis may no longer be applicable to the current
environment.

2.  On-line  Analysis 

On-line performance analysis enables network
operators to capitalize on the benefits offered by a real time
computational environment. Although the degree varies,
on-line performance analysis is currently practiced on the
Los Alamos Integrated Communications Network [Ref. 21], and
the Lawrence Livermore National Laboratory Octopus Network
[Ref. 24]. Additionally, the Codex Distributed Network
Control Systems 200 and 330 utilize an on-line approach to
performance analysis [Ref. 25]. This 'method' of analysis
provides for: a more immediate detection, diagnosis and
correction of network failure, a greater utilization of
advanced graphic capabilities for monitoring the network,
and an increased use of decision support capabilities which
can provide the operator with suggested courses of action
and adjustments to network performance parameters. Two main
problems exist with this approach. First, human
intervention is still required for the adjustment of parameters
in order to modify specific network performance measures.
Second, there remains considerable delay, with respect
to the speed of a local area computer network, between the
capture of network performance measurements and subsequent
action to affect their values.

3. Instantaneous Analysis


It is the researcher's opinion that the
implementation of an instantaneous performance analysis capability
could theoretically optimize the efficiency and
effectiveness with which network evaluation is conducted. Networks
that have implemented or plan to implement an instantaneous
performance analysis capability are the Ethernet [Ref. 14:
p. 717], and the Defense Data Network [Ref. 19: p. 4]. This
technique allows network operations personnel to establish
ranges within which performance measures may vary. If
measures for which ranges have been established breach these
predefined limits during normal network operations, an
interrupt can be generated which initiates a program
designed to bring the value of the performance measurement
back within the prescribed range. In this way we are taking
maximum advantage of the computer's ability to process
information almost instantaneously and thereby providing an
immediate response to current network conditions.
Instantaneous analysis and dynamic control of a network is
no longer just a theoretical concept. Advanced
installations can now offer significantly simplified or even
automatic intervention such as automatic restarts, automatic
remote-site monitoring, and electronic reconfiguration
[Ref. 20: p. 10]. The major problem with this technique is
that there exists a loss of explicit control of the network
by operations personnel as the monitoring, performance
analysis, and parameter adjustment become more automated.
Additionally, unless steps are taken to ensure otherwise,
the automation of these procedures may well deprive network
operators of information concerning just what is happening
inside the network.


B. LAN PERFORMANCE ANALYSIS

Performance is the property of a system that it works,
that it is responsive, and that it is available [Ref. 15: p. 4].
Given this, our performance analysis technique must enable
us to ascertain that these characteristics are accomplished
in the most efficient manner possible. By implementing a
performance analysis capability, we hope to obtain
information that: can be utilized to increase system responsiveness
and reliability, will assist in capacity planning, and
will reduce network operating costs. Additionally, tracking
network performance will assist operators in pinpointing
more precisely the nature of a failure, thereby helping to
correct it more quickly and reduce component downtime. This
section includes the identification of those performance
metrics that have been selected by the researcher as those
which can be most effectively utilized in the analysis of
SPLICE LAN performance. Additionally, a discussion of
performance parameter identification and selection is
undertaken.

1. Performance Measure Utilization

Utilizing the information provided by the ten
reports explained in Chapter 3, we are able to effectively
analyze the performance of the LAN. The question that must
be answered now is, 'What measures do we look at, and how do
we combine them to ensure a complete and accurate
representation of the network's performance is obtained?'.

The selection and combination of measurements for
the purpose of network performance analysis is based upon
the goals and objectives of the pending evaluation. Our
emphasis will be on using performance analysis to assist in
component failure detection and in improving the operational
functioning of the network. For this to occur, acceptable
ranges for critical performance measures under specific
network loading conditions and configurations must be
established. These ranges may be kept in tables, with
analytical models used to recompute them dynamically
at defined intervals for comparison against
actual measured performance. Values of critical performance
measurements which do not fall within established limits
should cause an interrupt to be generated which in turn
initiates some form of remedial action on the part of the
system. Finally, in order to maintain explicit control of
the system, network operations personnel must be given the
ability to establish and set the ranges for these criteria,
and predefine certain values taken from the network as
critical when they occur, such that the occurrence will be
brought immediately to their attention.
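
The range-table approach can be sketched as follows. This is
an illustrative outline only: the measure names, loading
conditions, and limits are hypothetical placeholders, and the
callback `on_violation` merely stands in for the interrupt
and remedial program described above.

```python
# Acceptable ranges per loading condition. All names and numbers here
# are hypothetical placeholders chosen for illustration.
RANGES = {
    "light": {"throughput_kbps": (50.0, 400.0), "mean_delay_ms": (0.0, 5.0)},
    "heavy": {"throughput_kbps": (400.0, 900.0), "mean_delay_ms": (0.0, 20.0)},
}

def out_of_range(load, measurements, ranges=RANGES):
    """Return the measures whose values fall outside their range."""
    limits = ranges[load]
    violations = {}
    for name, value in measurements.items():
        low, high = limits[name]
        if not low <= value <= high:
            violations[name] = value
    return violations

def check(load, measurements, on_violation, ranges=RANGES):
    """Call on_violation (standing in for the interrupt) for each breach."""
    violations = out_of_range(load, measurements, ranges)
    for name, value in violations.items():
        on_violation(name, value, ranges[load][name])
    return violations

alerts = []
check("heavy", {"throughput_kbps": 350.0, "mean_delay_ms": 12.0},
      lambda name, value, limits: alerts.append((name, value, limits)))
```

Because the table is ordinary data, operations personnel can
overwrite an entry to tighten or relax a range, which keeps
explicit control in their hands even when the check itself
runs automatically.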

Many possible combinations of performance
measurements exist. Metcalfe and Boggs [Ref. 18: p. 431] utilized
the criteria of: acquisition probability (the probability
that exactly one station attempts a transmission and
acquires the channel), wait time (the mean time a packet
must wait before successfully acquiring the channel) and
channel efficiency (that fraction of time the channel is
carrying good packets) to evaluate the performance of the
Ethernet. This approach is more of an experimental nature
and, in the opinion of the researcher, seems to be limited
in its usefulness in an operational environment.

Another possible combination was suggested by Tobagi
in a presentation at the Naval Postgraduate School in
Monterey, California, on 21 October 1982. One of the
topics addressed in that presentation dealt with
identification and utilization of performance measures for a local
area computer network. These measures were: bandwidth
utilization, system capacity utilization, and message delay.
By breaking these down into more specific monitoring areas,
we are provided with the ability to obtain a comprehensive
picture of network performance. These disaggregated
performance measures fall into two categories. The first
category provides for the evaluation of the network's
communication capability and includes as criteria: throughput,
response time, and file transfer rate. The second category
provides for the evaluation of resource utilization
throughout the network and includes: processor utilization,
buffer utilization, and line utilization [Ref. 16: p. 48].
The combination of these measurements, together with control
of the parameters which affect their values, enables network
operations personnel or dynamic control programs to detect
degrading network performance and take appropriate
corrective action.

2. Parameter Selection

Following the selection of appropriate network
performance measures, parameters must be identified which
can be adjusted in order to affect the values taken on by
these measurements. These parameters and their associated
values should be chosen on the basis of existing analytical
and simulation results as well as previous experiments
carried out in varied traffic conditions [Ref. 22]. The
researcher feels that once the parameters that affect the
value of a specific performance measure have been
identified, they should be prioritized. This prioritization
should be based upon the parameter's effect on the
performance measurement for a given system configuration. This
suggests that there may exist different prioritizations of
the parameters for different configurations of the network.
One possible prioritization scheme might call for the
adjustment of those parameters first which have the greatest
effect on the value of the performance measurement. An
important fact that should be considered when selecting,
prioritizing, and adjusting parameters is that the majority
of performance measurements and parameters are interrelated.
For example, an adjustment made to increase throughput, such
as increasing the size of the packet data field, will also
affect the delay experienced by the network user.

Finally,  there  are  many  parameters  which  can  affect 
one  performance  measurement.  Likewise,  the  adjustment  of 
one  parameter  is  capable  of  affecting  many  performance  meas¬ 
ures.  This  being  the  case,  it  would  be  extremely  difficult 
at  this  time  to  attempt  a  listing  of  all  those  parameters 
which  affect  the  values  of  the  performance  measures  we 
presented above. Rather, these would be more accurately
identified through the use of simulation, modeling, and
experimentation. In general, one cannot identify a single
tunable parameter which directly affects one specific
performance characterizing measure. Instead, one can
identify the two sets (parameters and measures), and through
experimentation, define their intersections [Ref. 26: p. 1].
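
One way to record such intersections, once experimentation
has produced them, is a simple table of parameter-to-measure
effects. The sketch below is hypothetical throughout: the
parameter names, measure names, and effect values are
invented for illustration and are not results from any
network.

```python
# Observed effect of each tunable parameter on each performance measure,
# e.g. the change in the measure per unit change in the parameter as
# found by experiment. All names and values are hypothetical.
EFFECTS = {
    "packet_data_field_size": {"throughput": 0.8, "delay": 0.5},
    "retransmission_limit":   {"throughput": 0.2},
    "buffer_count":           {"delay": -0.3, "buffer_utilization": -0.6},
}

def parameters_affecting(measure, effects=EFFECTS):
    """Parameters that affect `measure`, largest absolute effect first."""
    relevant = [(p, e[measure]) for p, e in effects.items() if measure in e]
    relevant.sort(key=lambda item: abs(item[1]), reverse=True)
    return [p for p, _ in relevant]

def measures_affected_by(parameter, effects=EFFECTS):
    """Measures that move when `parameter` is adjusted."""
    return sorted(effects[parameter])

delay_priorities = parameters_affecting("delay")
side_effects = measures_affected_by("packet_data_field_size")
```

Ordering by absolute effect corresponds to the prioritization
scheme suggested earlier in this section, and a separate
table could be kept for each network configuration.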

C. COMPONENT FAILURE

Along with managing the local area network-long haul
network interface, component failure detection and diagnosis
is seen as the most important function of network
management. The ability to provide users with a responsive,
available network is of primary importance. To do this, we
must be able to quickly detect and diagnose network failures.
Having this capability will allow the system to immediately
initiate appropriate recovery procedures and restore full
service to its users. This section will address the topics
of component failure detection, failure diagnosis, and
reporting of the failure throughout the network.

1. Failure Detection

A failure detection function should enable network
operators to recognize operating and configuration problems
immediately so they can intervene in a timely fashion to
correct them [Ref. 20: p. 10]. Not only do we want to be
made aware of catastrophic failures, but also of gradually
failing conditions. It is a well designed performance
analysis capability that enables us to be aware of the latter.

Component failure detection within a local computer
network can occur in many ways. Probably the simplest
is a face-to-face encounter between a user and a network
operator, the subject of discussion being either an
inoperable component or unsatisfactory network service. A phone
call from a remote user is another method of detection. We
can also see a network operator laboriously reviewing system
statistic reports for signs of degrading performance. From
these passive monitoring techniques, which required extensive
operator intervention, the emphasis has shifted and is now
on automatic alerts based on equipment failures and, in more
sophisticated applications, also on user-defined limits on
such items as transmission volumes and response time
[Ref. 20: p. 10]. Implementation of an automatic failure
detection capability is currently being planned for the
Defense Data Network [Ref. 19: p. 7].

It is not the researcher's intent to suggest that
all, or any subset of the detection methods to be presented
below should be automated. Rather, the author's approach
will be to identify and discuss possible techniques of
failure detection, understanding that their implementation
could take a variety of forms.

a. Maintenance Detection

Failures can be identified through normal
network maintenance activities. These activities may be
under operator or program control and may occur at
predefined intervals or on an as needed basis. For example, in
the process of updating the configuration data base, the
need may arise to poll all components within the network.
No response from a particular element may indicate the
existence of a failure. Additionally, the failure to receive a
required maintenance or status report from a component is
another indication that a problem may exist. Testing the
operation of the network utilizing artificial traffic
generation may also lead to the discovery of network
inefficiencies. Finally, the use of watchdog packets
[Ref. 8: p. 727] to verify active and inactive components is
also a viable tool that can be used in identifying failing
elements.
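
The polling case can be sketched as follows. The `poll`
function is a caller-supplied placeholder for whatever query
the network actually supports, and the choice of two attempts
per component is an assumption made for the example.

```python
def find_silent_components(configured, poll, attempts=2):
    """Poll every configured component; return those that never answer.

    `poll` is a caller-supplied function returning True when the
    component responds. It stands in for the real network query.
    """
    silent = []
    for component in configured:
        # any() stops polling a component as soon as it answers once.
        if not any(poll(component) for _ in range(attempts)):
            silent.append(component)
    return silent

# Simulated network in which adaptor "C" does not respond.
responding = {"A", "B", "D"}
suspects = find_silent_components(["A", "B", "C", "D"],
                                  poll=lambda name: name in responding)
```

A silent component is only a suspect at this point; the lack
of a response may also reflect a busy channel, so further
diagnosis is still required.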

b. Performance Analysis Detection

It is the researcher's opinion that the major
benefit to be gained from the performance analysis of a LAN
is the added capability it gives network operations
personnel in detecting failed components. Status reports
generated by individual components, and those created by the
central monitoring site, can be reviewed for: changes of
state, obvious trends, and erratic component performance.
Additionally, component error counts can be reviewed for
degrading conditions. Two systems which use approaches
similar to these are SNA [Ref. 27: p. 12], and the Arpanet
NCC [Ref. 23: p. 6-6]. In SNA, Record Maintenance
Statistics are generated periodically and sent to the
control point where they are logged and scanned to detect
degrading component performance. In the Arpanet, IMPs
examine their own status and send reports to the NCC every
minute. Finally, by monitoring the availability of a
component, which is defined as the mean time between failures
(MTBF) divided by the sum of the MTBF and the mean time to
repair (MTTR), we are able to detect a very gradual degradation of
that component's ability to perform its function over an
extended period of time.
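
The availability definition above translates directly into
code. In the sketch below the function names, the monthly
reporting period, and the sample figures are assumptions made
for illustration; only the formula itself comes from the
text.

```python
def availability(mtbf_hours, mttr_hours):
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def is_degrading(history, tolerance=0.0):
    """True if availability fell in every successive period."""
    return len(history) > 1 and all(
        later < earlier - tolerance
        for earlier, later in zip(history, history[1:]))

# Hypothetical monthly (MTBF, MTTR) figures for one component, in hours.
monthly = [(500.0, 2.0), (400.0, 2.0), (300.0, 3.0)]
trend = [availability(mtbf, mttr) for mtbf, mttr in monthly]
```

Each value in `trend` is still above 0.99, so no single
report looks alarming; it is the steady decline across
periods that reveals the gradual degradation.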

c.  Localized  Detection 

Detection of a failure within a component can be
accomplished by the component itself, assuming the failure
is not a catastrophic one. A trap mechanism within an
adaptor or component interface is a 'device' which is
activated whenever a certain hardware failure occurs or a block
of code is executed. This mechanism not only detects the
problem, but can be used to initiate some type of diagnostic or
corrective action. Hardware devices are also used for
problem detection at local levels. The Arpanet IMP hardware
is capable of automatically detecting power failures
[Ref. 23: p. 6-5], while the Ethernet employs a watchdog
timer which disconnects the transceiver from the channel if
it starts acting suspiciously [Ref. 18: p. 20]. One final
method of local failure detection is accomplished by
establishing a maximum number of retries for a packet
transmission. After a maximum of 15 retries to transmit a
packet, a transmitter on the Ethernet gives up and reports
the failure condition [Ref. 17: p. 23].
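
The retry limit can be illustrated with a short sketch. The
`transmit` function is a caller-supplied placeholder for the
real transmitter, and the exception name is invented for the
example; the backoff delay an actual Ethernet transmitter
applies between attempts is deliberately left out.

```python
MAX_RETRIES = 15  # retry limit cited in the text for the Ethernet

class TransmissionFailure(Exception):
    """Raised when a packet cannot be sent within the retry limit."""

def send_with_retries(packet, transmit, max_retries=MAX_RETRIES):
    """Try to send `packet`; return the number of retries that were needed.

    `transmit` is a caller-supplied function returning True on success.
    One initial attempt is followed by at most `max_retries` retries.
    """
    for attempt in range(max_retries + 1):
        if transmit(packet):
            return attempt
    raise TransmissionFailure(f"gave up after {max_retries} retries")
```

The code that catches `TransmissionFailure` is the natural
place to report the condition to the central monitoring site.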

Detecting the failure of an adaptor's attached
component and any peripherals associated with it is also a
requirement. Detecting the failure of an adaptor's attached
component can be accomplished through the use of an
inactivity timer. The purpose of this timer is to signal the
possibility that the attached component may have failed.
Once the timer runs down, action is initiated to verify the
status of the component. If test results indicate that the
component is down, the central monitoring site is notified.
Finally, it is assumed that failure detection of peripherals
attached to the network component will be accomplished by
that component. However, the requirement exists that the
status of these peripherals be accessible to local failure
detection routines in order that the central monitoring site
may be kept aware of their condition.
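
An inactivity timer of this kind might look like the
following sketch. The class and function names are
hypothetical, time is passed in explicitly so the logic can
be exercised without a real clock, and `run_test` and
`notify` are placeholders for the status test and the report
to the central monitoring site.

```python
class InactivityTimer:
    """Flags an attached component as suspect after a quiet period."""

    def __init__(self, timeout_s, now=0.0):
        self.timeout_s = timeout_s
        self.last_activity = now

    def record_activity(self, now):
        """Restart the timer whenever the component is heard from."""
        self.last_activity = now

    def expired(self, now):
        return now - self.last_activity >= self.timeout_s

def check_component(timer, now, run_test, notify):
    """On expiry, test the component and report it if the test fails."""
    if not timer.expired(now):
        return "active"
    if run_test():
        # Component answered the test: it was idle, not failed.
        timer.record_activity(now)
        return "idle"
    notify("attached component down")
    return "down"
```

Separating expiry from the follow-up test reflects the point
made above: the timer only signals the possibility of
failure, and the site is notified only after the test
confirms it.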

d.  Neighbor  Detection 

If a node experiences a catastrophic failure
without being able to notify the central monitoring site of
the impending doom, then we must have a method by which this
failure can be detected and the central monitoring site
notified. At the local level (i.e., without assistance from
the central monitor) there exist two possibilities, both of
which are based on the assumption that the failed node was
involved in a session when the failure occurred, or that
some other node will attempt to initiate a session with the
failed node within a reasonable amount of time after the
failure. Assuming the node in question is involved in a
session, there exist two methods of detection. The first
method involves the maximum number of times a packet will be
transmitted without receiving an acknowledgement. If,
during a session, a packet is successfully transmitted the
maximum number of times without receiving an
acknowledgement, then the transmitting station can assume the
destination is down and notify the central monitoring site.
Similarly, if the destination node stops receiving packets
from the source node without getting an end of message
indication, it can assume the source has failed and notify the
monitoring site. Finally, the technique based on nonreceipt
of acknowledgements can also be used when one station is


2. Failure Diagnosis


Once a failure has been detected, it must be
located, and its cause determined. These are the primary
objectives of a failure diagnosis function. This function
may be automated such that the detection of a failure
initiates a program which performs various diagnostic routines in
support of the accomplishment of these objectives. The
Defense Data Network utilizes an approach similar to this.
As planned, the DDN Monitoring Centers will be capable of
automatically monitoring network elements to identify,
isolate, and sometimes correct problems without specialized
maintenance personnel involvement.

When designing a set of diagnostic tools it should
be noted that, for some diagnostic tests and routines, monitoring
and normal data traffic flow may be suspended.
Assuming this to be the rule rather than the exception,
diagnostic tools should be developed accordingly. In the
following sections we will identify and discuss a number of
these tools.

a.  Tests  and  Traps 

Individual diagnostic programs can be utilized
to initiate specific tests in areas of the failed component.
Additionally, traps can be utilized to activate these
programs once a failure has been detected. Tests conducted
on the component might include checking all physical connections
the component has with other devices and comparing a
block of code in the component with an image of what that
code should be.
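
As an illustration of the code-versus-image test, the sketch below compares the bytes resident in a component with a reference image and reports where they differ. The function name and the representation of both code blocks as byte strings are assumptions made for the example.

```python
def compare_with_image(loaded: bytes, image: bytes) -> list[int]:
    """Return the byte offsets at which the code resident in a component
    differs from the reference image. If the lengths differ, every
    offset beyond the shorter block is reported as a difference."""
    limit = min(len(loaded), len(image))
    diffs = [i for i in range(limit) if loaded[i] != image[i]]
    diffs.extend(range(limit, max(len(loaded), len(image))))
    return diffs
```

An empty result indicates that the resident code matches the image; any other result gives the diagnostician a starting point for locating the corruption.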




b.  Interface  Looping 


A good diagnostic tool for a network interface
is the ability of a node to send packets to itself. In
giving a node the ability to transmit and simultaneously
receive the transmitted packets, we are able to obtain
complete verification of the network interface. This is
where our artificial traffic generator comes into use. We
can generate a stream of packets with known content and known
size and arrival distributions. By checking the returning
traffic against what was just generated, we can identify any
problems which may exist in our network interface.
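
A minimal sketch of such a loopback check is given below. It assumes the interface can be modelled as a function that takes the transmitted packet and returns what the node received back; both function names are hypothetical, and only packet content and size are generated (arrival timing is not modelled).

```python
import random
from typing import Callable


def make_test_packets(count: int, max_size: int, seed: int = 0) -> list[bytes]:
    """Generate packets whose content and sizes are reproducible, and
    therefore known in advance, by seeding the random generator."""
    rng = random.Random(seed)
    return [
        bytes(rng.randrange(256) for _ in range(rng.randint(1, max_size)))
        for _ in range(count)
    ]


def loopback_test(
    interface: Callable[[bytes], bytes], packets: list[bytes]
) -> list[int]:
    """Send each packet through `interface` and return the indexes of
    the packets that came back altered."""
    return [i for i, sent in enumerate(packets) if interface(sent) != sent]
```

An empty list from `loopback_test` means every packet returned intact.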

c.  Dynamic  Diagnostic  Tool  (DDT) 

The use of a Dynamic Diagnostic (or Debugging)
Tool was introduced by the Arpanet NCC [Ref. 23: p. 6-5].
The DDT is a set of software programs which are utilized in
an effort to diagnose the cause of a component failure. The
DDT may be local to, or transmitted to, the machine associated
with the failed component. DDT can be used to perform
a number of tests and operations geared towards determining
the cause of a component failure. These include: the examination
and modification of a specific word in memory,
clearing an entire block of memory, searching memory for a
particular stored value, examining the contents of specific
buffers and modifying their contents, measurement of a
device's realtime clock, and implanting traps and interrupt
handlers in a device suspected of having software or hardware
problems.
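
The memory-oriented operations in this list can be pictured with the toy model below. It is not the Arpanet DDT; the class and method names are invented for illustration, and memory is assumed to be a simple list of integer words.

```python
class MemoryDebugger:
    """Toy model of DDT-style operations on a component's memory,
    where memory is represented as a list of integer words."""

    def __init__(self, words: list[int]):
        self.words = words

    def examine(self, addr: int) -> int:
        """Return the word stored at `addr`."""
        return self.words[addr]

    def modify(self, addr: int, value: int) -> None:
        """Overwrite the word stored at `addr`."""
        self.words[addr] = value

    def clear_block(self, start: int, length: int) -> None:
        """Zero `length` consecutive words beginning at `start`."""
        for addr in range(start, start + length):
            self.words[addr] = 0

    def search(self, value: int) -> list[int]:
        """Return every address whose word equals `value`."""
        return [addr for addr, word in enumerate(self.words) if word == value]
```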

d.  Dump  and  Load 

If all other diagnostic methods fail to determine
the cause of a failure, one final course of action
exists. The entire contents of main memory existing within
the component at the time of failure are dumped to off-line
storage where additional diagnosis and analysis can be
conducted. Simultaneously, a new copy of the appropriate
software is loaded into the component. If this procedure
still fails to correct the problem or bring the device back
on-line, it can be assumed, with a high degree of certainty,
that a hardware problem exists and that contacting the
appropriate vendor personnel is in order.

3.  Failure Notification

We now address the question, 'Who is notified and
what data bases are updated upon the detection of a failed
component?'. Assuming detection and diagnosis were accomplished
by a distributed component (relative to the
monitoring site) in the network, the central monitoring site
should be the first entity notified of the failure.
Realistically, notification of the various entities to be
identified below could happen simultaneously, or nearly so.
It would then be the responsibility of the central monitoring
site to notify additional entities and to update the
appropriate data bases. These data bases are updated by the
central monitoring site in basically two ways, either by
operations personnel or by a program which automatically
makes entries into the appropriate data bases upon receipt
of failure alert messages. The data bases that must be
updated include: the configuration data base, the problem
management data base, and a historical data base which is
utilized as a means through which the evolution of the
network can be tracked.
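
The automatic update path might look like the following sketch, in which one failure alert produces an entry in each of the three data bases. The class, its field layout, and the alert fields are assumptions made for the example; the source does not specify a record format.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoringSite:
    # Hypothetical in-memory stand-ins for the three data bases.
    configuration: dict[str, str] = field(default_factory=dict)  # component -> status
    problems: list[dict] = field(default_factory=list)           # problem records
    history: list[str] = field(default_factory=list)             # chronological log

    def on_failure_alert(self, component: str, reporter: str, detail: str) -> None:
        """Record one failure alert in all three data bases."""
        self.configuration[component] = "failed"
        self.problems.append(
            {"component": component, "reported_by": reporter,
             "detail": detail, "status": "open"}
        )
        self.history.append(f"{component} failed ({detail}), reported by {reporter}")
```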

There are a number of additional entities which must
also be notified upon the detection of a failed component.
To begin with, if monitoring site personnel are unable to
restore the failed component, then the appropriate vendor
must be contacted. The remaining LAN users will be notified of
the failure by the problem management data base or configuration
management data base when they log onto the network,
or when they attempt to utilize the resources normally
provided by that component. Users attempting to utilize the
resources of the failed component from a geographically
dispersed site through the DDN will be notified of the
failure in a manner analogous to local users once they have
made contact with the LAN. Finally, those members of the
operations staff who may be in the process of conducting any
type of experiments or monitoring activities which include
the failed component must be explicitly notified of the
configuration change.

D.  CHAPTER SUMMARY

We began this Chapter with a discussion of the possible
time frames in which network performance analysis could
occur. Those discussed were: off-line, on-line, and
instantaneous analysis. The topic of local area network
performance analysis was then taken up; in that section
we discussed performance measure utilization and performance
parameter selection. A presentation of various methods of
component failure detection and diagnosis concluded the body
of the Chapter.

It is the researcher's opinion that a SPLICE LAN could
benefit from each type of analysis. Instantaneous analysis
could be utilized to evaluate and affect the performance of
the network layer protocol and below. This would reduce
management overhead in that personnel would not be needed to
constantly monitor network status via a CRT, or to review
and analyze printouts reflecting the network's condition.
For example, adjustments made to increase throughput during
times of network congestion, such as modifying our backoff
technique, would be accomplished by a program rather than
requiring human intervention. Overhead costs associated
with the running of these programs would have to be compared
against the costs incurred by non-automated and semi-automatic
procedures in order that an efficient and cost-effective
functional implementation is achieved. An on-line
analysis capability would give the operator a window through
which the functioning of the network could be observed.
Through the use of some sort of decision support system, the
operator could obtain assistance, possibly in the form of
suggested actions or help in adjusting parameters which affect
global performance measures. Off-line analysis would
provide operators with the ability to analyze the performance
of the network in an environment separate from the
system. This method would remove any pressure that might
be experienced by the operator when attempting to analyze
performance while on-line.

The performance measures suggested by the author for the
SPLICE LAN are separated into two categories. The first
category provides for the evaluation of the network's communication
capability. The second category includes measures
which can be used to evaluate resource utilization
throughout the network. Each of these was described in
detail earlier in the Chapter. Ranges for these measures
should be determined dynamically during network operation;
however, network operations personnel must be able to override
dynamic range establishment and set their own range
values as needed. Numerous parameters exist which can be
utilized to affect the values of these performance measures.
Rather than proposing a list of tunable parameters that, by
its very nature, would be incomplete, the author offers
three suggestions for their identification and utilization.
These parameters and their associated values should be
established on the basis of existing analytical models and
simulation results as well as operational experimentation.
Once identified, these parameters should be prioritized in a
manner which reflects their effect on specific performance
measurements. Finally, the fact that adjustment of one
parameter may affect the value of more than one performance
measure must be considered in the selection and implementation
of the parameter.

Our discussion of a failure detection and diagnosis
capability as part of the SPLICE LAN emphasizes
limiting the capabilities of this kind possessed by components
distributed throughout the network. The failure detection
capability of a distributed component is limited to the
identification of those failures which cannot be detected by
the central monitoring site. The diagnosis capability is
to be similarly restricted. It is the researcher's
opinion that this approach will reduce diagnostic software
duplication throughout the network, eliminate maintenance on
distributed diagnostic tools, and provide for more central
control of failure analysis and problem management. Upon
detecting a failure, the component will send some form of
problem alert message to the central monitoring site. From
that point, the actions taken by the monitoring site are
identical to those that it would take if it had detected the
failure. Since we have limited the detection and diagnosis
capability of the distributed components, we
must correspondingly increase those of the central monitoring
site. It is
felt that periodic status reports such as those described in
Chapter 3 should be sent to the central monitoring site on a
regular basis. There they can be analyzed for possible
signs of component failure and system degradation. In addition
to those capabilities alluded to above, the monitoring
site must be able to direct the transmission of status
reports from distributed network components to itself. It
must possess a diagnostic tool that can be utilized
throughout the network to isolate and identify failures in a
manner similar to that of the DDT employed by the Arpanet
Network Control Center. More details concerning the functioning
of a central monitoring site will be discussed in
Chapter 6. It is sufficient to conclude at this time that,
by centralizing the majority of the network's failure detection
and diagnostic capabilities, we are increasing control
of the failure handling procedures of the network, reducing
software duplication and maintenance, and minimizing costs
associated with the implementation of a failure detection
and diagnostic function.
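
One simple way the central monitoring site could analyze periodic status reports for signs of failure is to flag components that have fallen silent. The sketch below assumes each component reports every `interval` seconds and that missing `missed` consecutive reports is suspicious; the function name and both parameters are illustrative choices, not values taken from the source.

```python
def overdue_components(
    last_report: dict[str, float], now: float, interval: float, missed: int = 2
) -> list[str]:
    """Return, sorted by name, the components whose most recent status
    report is older than `missed` reporting intervals."""
    cutoff = now - missed * interval
    return sorted(name for name, t in last_report.items() if t < cutoff)
```

A component flagged here would then be probed with the site's diagnostic tools rather than declared failed outright.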
V.  MANAGING THE LAN-DDN INTERFACE


As stated in Chapter 4, along with failure identification
and correction, the most important function of LAN
management is the monitoring and control of the local area
network to long haul network interface. This function is
primarily concerned with regulating the flow of packets
between the networks and any other tasks which support its
accomplishment. In this Chapter, our emphasis will be on
identifying and discussing the managerial aspects associated
with the interconnection of two networks.

A fundamental aspect of internetwork communication is
the establishment of agreed upon conventions. Communicating
entities must share some physical transmission medium and
they must use common conventions or agreed upon translation
methods [Ref. 29: p. 1392]. This required commonality can
be achieved in a number of ways. Protocols of one net can
be translated into those of another, or common protocols
could be defined. Another method through which commonality
may be achieved calls for conversion to a standard interface
by all networks. It is the researcher's opinion that the
connection of long haul networks to local area networks does
not lend itself to the establishment of common protocols
that would be efficient for both networks. Additionally,
the benefit to be derived from converting to a standard
interface is only realized if a network is connected to more
than one other network. If connected to only one network,
utilizing a standard interface would require two protocol
translations. Network A's protocol would require translation
into a standard interface protocol which would then
require translation into network B's protocol upon arriving
at the connected network. The use of protocol
translation, by contrast, would only require the conversion of A's
protocol to B's, or vice versa, depending upon the direction
of packet flow. Therefore, we will consider the issue of
managing the interface between two networks from the standpoint
of protocol conversion, rather than from that of common
protocol or standard interface establishment.

There are many differences which exist between networks
that must be resolved. Those that will be covered in detail
in this Chapter include: naming and addressing, flow and
congestion control, packet size, and access control.
Additional areas to be discussed will encompass gateway
configuration, internetwork accounting, and dispersal of
network status information.

A.  GATEWAY CONFIGURATION

In an effort to set the stage for a discussion dealing
with the management of an interface between two networks, it
is felt that an understanding of possible gateway configurations,
or levels of interconnection as dubbed by Cerf
[Ref. 29: p. 1392], will prove beneficial. There are a
variety of different ways in which the gateway between two
packet switched networks may be configured [Ref. 28: p.
4-49]. We will briefly describe each one and discuss why it
should, or should not, be considered for the SPLICE
network. Finally, for explanatory purposes, we will select
a configuration and use it as an example throughout the
Chapter.

Utilizing a common host is a simple and very straightforward
approach that can be used to connect two networks.
This method connects two networks through a host that is
attached to both of them. This configuration can be
ruled out immediately from consideration for the SPLICE
network, because the entire SPLICE program is based
upon relieving the host(s) of communication responsibilities.
To burden the host computer with anything but the
processing of application programs would be entirely against
the SPLICE concept.

Another approach to interconnecting packet switching
networks would be to have a switching node which is common
to both of them. This method must also be ruled out from
consideration. First of all, the LAN does not possess a
switching node. An attempt might be made to combine the
functions of a DDN switching node and the LAN front end
processor (FEP). Although a technically feasible solution,
the drawbacks are major and numerous.

An internode device can be used as a separate entity to
perform only gateway functions between each of the networks
to be interconnected. This gateway is normally designed to
appear as a special host to each network. This approach
provides the most acceptable alternative; however, it is the
author's opinion that the requirement for additional hardware
to perform the interconnection of two networks is not
supportive of the SPLICE concept.

The final possibility for a gateway configuration
utilizes the existing capabilities of a DDN switching node
(IMP) and the local area network FEP. This configuration
is called the "two half-gateway". In the "two half-gateway"
approach, a gateway is composed of two halves, each associated
with its own network. Each half-gateway would only be
responsible for translating between the internal packet
format of its own network and some common internetwork
format. The number and different types of networks the DDN
ties into will dictate whether or not an approach of this
nature is optimum. For the time being, no standard internetwork
format has been proposed. This being the case, a
slight modification to this approach should make it usable
and efficient for connecting a SPLICE local area network to
the DDN. This change would require a conversion from the
internal protocol(s) of the local area network to the
protocol(s) of the Defense Data Network and vice versa,
depending on the direction the packets are flowing.

For the remainder of this Chapter, we will utilize the
"two half-gateway" as our basis for explaining the differences
that must be overcome when connecting two networks,
and the functions which must be accomplished by the gateway.
For our discussion, it may help to picture one half of the
gateway implemented in the LAN FEP, and the other half of
the gateway resident in a DDN switching node which, in
conjunction with the LAN FEP, allows communication between
the two networks to be achieved. Finally, a number of
assumptions have been made which, it is felt, will add clarification
to the concepts discussed and provide a basis upon which
analysis can be conducted and proposals made.


1)  The  LAN  cannot  affect  the  speed  at  which  packets  transit 
the  DDN. 

2) The LAN FEP cannot increase the rate at which packets are
sent to it from the switching node past the maximum
transmission rate of that node.

3) The switching node that the LAN ties into may also act as
an IMP through which other hosts, not part of the LAN,
access the DDN.

4) Error control, flow control, and duplicate packet detection
are provided for communication between the LAN FEP
and the DDN switching node by one of the network access
protocols supported by the DDN. In this situation, the
switching node merely views the front end processor as
another host.
B.  PACKET SIZING


The problem of differences in packet size is basically
one of coping with the fragmentation that must inevitably
occur when the two interconnected networks employ different
internal maximum packet sizes [Ref. 28: p. 4-49]. Two situations
may exist: one in which the maximum packet size for
the LAN is greater than that of the long haul network (LHN),
and the other in which the maximum packet size for the long
haul network is greater than that of the
local area network.

The first case, when LAN maximum packet size is greater
than the long haul network maximum packet size, can be
handled in one of three ways. First, if the packet to be
transmitted from the LAN to the LHN is smaller than the LHN
maximum packet size by at least the number of additional
overhead bytes that will be added on by the packet switching
node once the packet reaches the DDN, then the packet
requires no size modification before being sent to the
switching node. Second, if the packet to be transmitted is
larger than the LHN maximum packet size, we may fragment the
packet appropriately in the FEP. Each packet would be fragmented
such that the new packets would be smaller than the
maximum packet size for the LHN even after the overhead
bytes were added by the LHN switching node. A number of
problems exist with this approach, which include a requirement
for increased software capabilities at the FEP,
additional delay experienced by packets wanting to leave the
network, and the possibility that resequencing of all the
packets making up the message being sent may be required due
to the insertion of a "new" packet into the sequential
series of packets that have been transmitted from somewhere
in the LAN. Finally, if the packet to be transmitted is
larger than the LHN maximum packet size, we may just go
ahead and send the packet to the switching node. We are
able to do this because the DoD Standard Internet Protocol,
which will be implemented by the Defense Data Network,
provides for a fragmentation/reassembly service. It is
envisioned that the "over-sized" packet would be fragmented
with each piece being sent to the destination switching node
where the fragments would be reassembled back into the
"over-sized" packet.

In addressing the second case, where LHN maximum packet
size is greater than LAN maximum packet size, we assume that
the fragmentation of a smaller LAN packet to help fill up a
partially filled larger LHN packet will not occur. In this
situation, the main concern of the LAN is that it might
receive a packet from the DDN which is larger than its
maximum packet size. This being the case, the LAN FEP must
possess the capability to fragment the larger packet into
packets suitable for transmission on the local area network.
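
Fragmentation in the FEP, in either direction, reduces to splitting a payload into pieces that fit the receiving network's limit. The sketch below is an illustration only: the function name is hypothetical, `overhead` stands for the bytes the switching node will add (zero when fragmenting a DDN packet for the LAN), and a fragment is taken to fit when its size plus overhead does not exceed the limit. Sequencing and reassembly headers are deliberately left out.

```python
def fragment_packet(payload: bytes, max_size: int, overhead: int = 0) -> list[bytes]:
    """Split `payload` so that each fragment, plus `overhead` bytes added
    downstream, fits within `max_size`. A payload that already fits is
    returned as a single fragment."""
    room = max_size - overhead
    if room <= 0:
        raise ValueError("overhead leaves no room for data")
    return [payload[i:i + room] for i in range(0, len(payload), room)]
```

For example, with a 128-byte limit and 28 bytes of overhead, a 250-byte payload yields fragments of 100, 100, and 50 bytes.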

C.  CONGESTION  CONTROL 

Assuming probabilistic message generation and fixed
capacity in network components, overload would be inevitable
without certain mechanisms to stop, slow down, or absorb
arriving messages. The basic tool utilized in the
accomplishment of these tasks is congestion control.
Congestion control can be defined as a procedure whereby
distributed network resources, such as channel bandwidth,
buffer capacity, and CPU capacity, are protected from
over-subscription by all sources of network traffic [Ref. 29: p.
1400]. Congestion is most likely to be visible at a gateway
connecting a local area network to a long haul network. In
some cases, the transmission rates of LANs might exceed
those of long haul networks by factors of 30-100 or more
[Ref. 29: p. 1400]. There are basically two schools of
thought when it comes to dealing with the problem of
congestion control. There are those who advocate rigidly
controlling the input of packets into a network and explicitly
rule out the discarding of packets as a means of
congestion control. Conversely, there are some who
promote the dropping of packets as the sole means of
controlling congestion [Ref. 29: p. 1400]. We will look at
congestion and flow control at the interface between two
networks from both of these viewpoints. It is the author's
intention to propose and discuss techniques of congestion
and flow control for receipt of packets from the LAN and for
receipt of packets from the DDN by the half of the gateway
resident in the LAN FEP.

1.  LAN to LHN Packet Control

The author has concluded that there exist numerous
methods of congestion control, many of which have yet to be
identified. The discussion which follows includes the presentation
of three possible methods of gateway congestion and
flow control. These methods deal with the handling of
packets received from the LAN by the front end processor and
destined for the DDN.

The simplest method of congestion control provides
for the immediate transmission of packets to the DDN. If
the gateway portion of the FEP, in conjunction with its
associated switching node, is able to successfully transmit
packets to the DDN faster than they arrive from the LAN,
then we can assume the requirement for congestion and flow
control is minimal in that direction. However, the author
has concluded that this is rarely the case. This approach
would inevitably lead to the loss of packets due to the
gateway's inability to transmit them at a rate comparable to
that of the LAN. Recovery/retransmission of those 'lost'
packets to the gateway would be left to the lower level
protocols.


Another method through which congestion control at
the FEP could be accomplished would be through the addition
of buffers. Packets flowing in from the LAN could be queued
in a buffer for subsequent transmission to the long haul
network. Once this buffer becomes full, packets could be
discarded as in the first method, or a signal of some type
could be sent throughout the network indicating that the DDN
output buffer was full. Receipt of this message would also
imply that no internetwork traffic should be sent until a
message is received from the gateway indicating that the
buffer is empty and internetwork traffic transmission can be
resumed.
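
The single-buffer scheme with "full" and "empty" notices can be sketched as below. The class, its method names, the notice texts, and the `broadcast` callback (standing in for whatever mechanism signals the LAN) are all assumptions made for illustration; packets arriving while the buffer is full are simply discarded, as in the first method.

```python
from collections import deque
from typing import Callable, Optional


class GatewayBuffer:
    """Bounded FIFO for LAN-to-DDN packets that broadcasts a notice when
    it fills and another when it has drained completely."""

    def __init__(self, capacity: int, broadcast: Callable[[str], None]):
        self.capacity = capacity
        self.broadcast = broadcast
        self.queue: deque = deque()
        self.hold = False  # True while senders have been asked to wait

    def accept(self, packet: bytes) -> bool:
        """Queue a packet from the LAN; return False if it was discarded."""
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(packet)
        if len(self.queue) == self.capacity:
            self.hold = True
            self.broadcast("DDN buffer full")
        return True

    def send_next(self) -> Optional[bytes]:
        """Hand the oldest queued packet to the switching node, if any."""
        if not self.queue:
            return None
        packet = self.queue.popleft()
        if self.hold and not self.queue:
            self.hold = False
            self.broadcast("DDN buffer empty")
        return packet
```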

This technique could also be employed with two
buffers. Once one buffer was full, it would be disabled
from receiving additional packets while transmission took
place. Simultaneously, the second buffer could be filled
and its contents transmitted when the first buffer became
empty. While the second buffer's contents were being transmitted
to the DDN, the first buffer would be receiving
packets from the LAN. This alternating technique could be
employed with N buffers, but this would be at the expense
of losing N buffers' worth of memory space in the FEP. This
being the case, a limit to the buffer space allocated to
internetwork traffic would have to be established. With
this limited buffer space, there still exists the possibility
that all buffers may become full simultaneously.
This would require incoming packets to be discarded or
notification throughout the network that buffers are full
and internetwork packet transmission is disabled.
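
The two-buffer alternation can be illustrated with the simplified sketch below. It is an assumption-laden model: the class and method names are invented, and the buffers swap roles as soon as the one being transmitted is empty, whether or not the one being filled has reached capacity.

```python
from typing import Optional


class AlternatingBuffers:
    """Two equal-sized buffers: one receives packets from the LAN while
    the other is transmitted to the DDN; the roles swap whenever the
    transmitting buffer runs empty."""

    def __init__(self, size: int):
        self.size = size
        self.filling: list = []   # currently receiving from the LAN
        self.draining: list = []  # currently being sent to the DDN

    def accept(self, packet: bytes) -> bool:
        """Store a packet; return False if the filling buffer is full."""
        if len(self.filling) >= self.size:
            return False
        self.filling.append(packet)
        return True

    def send_next(self) -> Optional[bytes]:
        """Return the next packet for the DDN, swapping buffers if needed."""
        if not self.draining:
            self.filling, self.draining = [], self.filling
        return self.draining.pop(0) if self.draining else None
```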

A final method by which the flow of traffic from the
LAN to the LHN can be controlled is through the use of
external storage areas. This technique is very similar to
the buffering methods presented above. Buffers are utilized
in the same fashion but, when they become full, rather than
discarding packets or notifying the network of the buffers'
state, all incoming packets are directed to external
storage areas. When the buffers begin to empty, packets
currently being stored are directed to the output buffers on
a FIFO basis. This procedure reduces congestion on the LAN
by not requiring the continual retransmission of packets not
previously accepted by the gateway. Additionally, it eliminates
the need for distributed components to be able to
recognize a "DDN buffers full" message and carry out the
internetwork packet restricting action necessitated by its
receipt.
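
A sketch of the external-storage variant follows. The names are hypothetical and an in-memory queue stands in for the external storage area; the point illustrated is that, once any packet has been spilled, later arrivals must also be spilled so that FIFO order is preserved.

```python
from collections import deque
from typing import Optional


class SpillingBuffer:
    """Fixed-size output buffer backed by an overflow store (a deque
    stands in for external storage). Assumes capacity >= 1."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.buffer: deque = deque()
        self.external: deque = deque()

    def accept(self, packet: bytes) -> None:
        # Spill if the buffer is full, or if earlier packets are already
        # in external storage and must not be overtaken.
        if self.external or len(self.buffer) >= self.capacity:
            self.external.append(packet)
        else:
            self.buffer.append(packet)

    def send_next(self) -> Optional[bytes]:
        """Return the oldest packet and refill the buffer from storage."""
        if not self.buffer:
            return None
        packet = self.buffer.popleft()
        if self.external:
            self.buffer.append(self.external.popleft())
        return packet
```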

2.  LHN to LAN Packet Control

As previously stated, we are assuming that the flow
of packets between the FEP half of the gateway and the switching
node half of the gateway is controlled by the network access
protocols supported by the DDN. This being the case, our
discussion is restricted to answering questions such as:
'Should we transmit each packet immediately onto the LAN
upon its receipt from the DDN?', or 'Should we employ some
buffering techniques, accumulating some packets before
transmitting them onto the LAN?'.

It is the author's opinion that the trickling of
packets onto the network one at a time does not efficiently
utilize the capabilities of a 13 Mbit/sec LAN. This method
reduces network throughput, and requires adaptors and components
to "wait" longer between internetwork packet arrivals.
By storing the internetwork packets in buffers or dedicated
external storage areas, we are able to transmit packets onto
the LAN in bursts. These transmissions can occur after
an entire message is received or after a certain number of
packets have accumulated in the buffers or external storage
areas.
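
The two burst triggers just described (end of message, or a packet count) can be sketched as follows. The class name, the `threshold` parameter, and the `send_burst` callback are assumptions for the example; how the end of a message is recognized depends on the access protocol and is represented here by a simple flag.

```python
from typing import Callable


class BurstForwarder:
    """Accumulate packets arriving from the DDN and release them onto
    the LAN as one burst when a message ends or when `threshold`
    packets are waiting."""

    def __init__(self, threshold: int, send_burst: Callable[[list], None]):
        self.threshold = threshold
        self.send_burst = send_burst
        self.pending: list = []

    def receive(self, packet: bytes, end_of_message: bool = False) -> None:
        self.pending.append(packet)
        if end_of_message or len(self.pending) >= self.threshold:
            self.send_burst(self.pending)
            self.pending = []  # start a fresh list for the next burst
```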
D.  ADDRESSING AND NAMING


Whenever any two devices must communicate with each
other and they are not directly connected (i.e., a processor
on one network communicating with a processor on another
network), the question of addressing the proper recipient
becomes a major consideration. Addressing across network
boundaries requires either a standard network numbering
scheme or a means of address translation in the gateway
[Ref. 28: p. 4-49]. It is known that the DDN will connect
with existing networks as well as the SPLICE local area
networks. It is the author's opinion that this in itself is
sufficient to justify the establishment of a standard
numbering scheme. This will therefore be the premise upon
which our discussion will be based.

Many different possible internetwork addressing schemes
exist. The CCITT X.121 addressing strategy is based on the
telephone network system. This technique allows up to 14
digits per address. The first 4 digits are a destination
network identifier code (DNIC), followed by the remaining
digits which may be used to implement a hierarchical
addressing structure [Ref. 29: p. 1403]. The DARPA Internet
has implemented a common address format across all networks
it connects [Ref. 30: p. 114]. The Internet address length
is fixed at 32 bits. These bits contain the address of a
particular network, and the address of a host within that
network. A further disaggregation of this concept might
call for an address field which contained a network address,
the address of a packet switching/gateway node within that
network, and the address of a host accessible through that
node. We will utilize the addressing technique implemented
in the Internet as the basis for the remainder of our
discussion.
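
As an illustration of a fixed 32-bit address carrying both a
network part and a host part, the following sketch packs and
unpacks such an address. The 8-bit/24-bit split and the
function names are assumptions chosen for the example; the
text above fixes only the total length.

```python
from typing import Tuple

NETWORK_BITS = 8                 # assumed split; only the 32-bit total is given
HOST_BITS = 32 - NETWORK_BITS


def pack_address(network: int, host: int) -> int:
    """Combine a network number and a host number into one 32-bit address."""
    if not 0 <= network < (1 << NETWORK_BITS):
        raise ValueError("network number out of range")
    if not 0 <= host < (1 << HOST_BITS):
        raise ValueError("host number out of range")
    return (network << HOST_BITS) | host


def unpack_address(address: int) -> Tuple[int, int]:
    """Split a 32-bit address back into (network, host)."""
    if not 0 <= address < (1 << 32):
        raise ValueError("address is not a 32-bit value")
    return address >> HOST_BITS, address & ((1 << HOST_BITS) - 1)


addr = pack_address(network=10, host=515)
print(f"{addr:#010x}", unpack_address(addr))
```

The further disaggregation mentioned above (network, node,
host) would follow the same pattern with a third field carved
out of the host bits.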

In order to manage, control, and support communications
among components distributed throughout two or more
networks, a means must exist for explicitly identifying the
components involved in the communication. This could be
accomplished by utilizing one of the addressing strategies
presented above. In implementing this strategy, rather than
requiring the user to be aware of the structure of the
network in which the destination host resides, a naming
convention could be established which relieves him of
indicating the actual address of the desired host. A naming
convention can also be established for identifying the
network to be accessed rather than requiring a specific
address to be provided.

Assuming an operator may now use names to identify both
the destination network and the host within that network,
the task of converting these to actual network addresses
must be considered. Translation of the network name to a
specific network address will be accomplished by the
switching node through which a SPLICE LAN is connected to
the Defense Data Network. Currently, nodes attached to the
DDN may be known by as many as four different names
[Ref. 31: p. 111]. The translation of a local host name to
its associated address, and vice versa, could be accomplished
by the switching node. The author does not support this
approach for the major reason that the switching node will
most probably be connecting other networks and/or hosts to
the DDN. For it to perform these translations as well
would mean a reduction in the node's capability to perform
its primary functions of traffic processing, host access,
routing, and monitoring and control [Ref. 31: p. 33].
Therefore, local host name translation must be performed at
the local level.

This local translation capability could be accomplished
at the interface between a distributed component and the
bus. This would require the use of additional component
resources for the performance of a function which could most
efficiently be implemented at a centralized location (i.e.
the Front End Processor), rather than at each individual
component. By incorporating the local translation
capability into the LAN's FEP, we not only reduce redundancy
throughout the system, but also facilitate the maintenance
of our translation tables. The final issue to be addressed
is the place (source or destination network) at which this
translation occurs. Translation of the destination's name
can occur either at the source's gateway or at the
destination network. By delaying translation of the
name to an address until arrival at the destination, we
eliminate the requirement for each gateway to possess
specific address information about other networks.
Similarly, the translation of a process name to a process
address would also be accomplished by the destination
network's FEP. The half-gateway resident in each SPLICE FEP
would only be required to maintain a table containing the
names and addresses of its local components and processes.
Upon receiving a packet from the DDN, the component and
process string names would be compared against entries in
the address translation tables. Appropriate addresses would
replace the physical node name and process name. The packet
would then be ready for transmission onto the local bus.
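
A minimal sketch of destination-side translation follows.
The packet layout, the table class, and the sample names and
addresses are hypothetical; the handling of a name that is not
in the table is not specified in the text and is shown here
simply as a failed lookup.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class InternetworkPacket:
    component: str   # component name, or its local address after translation
    process: str     # process name, or its local address after translation
    data: bytes


class NameAddressTable:
    """Table kept by the FEP half-gateway for its own LAN only."""

    def __init__(self, components: Dict[str, str], processes: Dict[str, str]) -> None:
        self.components = components
        self.processes = processes

    def translate(self, packet: InternetworkPacket) -> Optional[InternetworkPacket]:
        """Return a copy with names replaced by local addresses, or None if unknown."""
        component_addr = self.components.get(packet.component)
        process_addr = self.processes.get(packet.process)
        if component_addr is None or process_addr is None:
            return None  # unknown name; the required action is left open here
        return InternetworkPacket(component_addr, process_addr, packet.data)


table = NameAddressTable(
    components={"HOST-A": "0x12"},        # illustrative entries only
    processes={"REQUISITION": "0x04"},
)
ready = table.translate(InternetworkPacket("HOST-A", "REQUISITION", b"payload"))
print(ready)
```

Because each half-gateway holds entries for its own LAN only,
adding a host to one network requires a table change at one
FEP and nowhere else.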

E.  ACCESS  CONTROL 

Access control is concerned with establishing mechanisms
that may be required to prevent some traffic from entering
and possibly some traffic from leaving the network. This
filtering action is ideally accomplished by the gateway
between the two networks. Utilizing our model of a "two
half-gateway", each half can deal with controlling access to
the network that it is connected to. What this means is
that our half of the gateway in the LAN FEP can act as a
sentry to incoming traffic. As traffic arrives, the "ID" of
the packet(s) can be checked against a table containing the
"names" of those packets which are authorized to enter the
LAN. If a packet's "ID" appears on the access list, entry is
granted; if not, the sentry may either discard these packets
or possibly send them to an access controller [Ref. 29: p.
1401]. The access controller routine can then dynamically
enable the flow of the packets into the network after
performing certain checks on the packets' identity, or it may
decide that these packets are not to be allowed into the
network, discard them, and send a suitable 'canned' response
to the source of the packet(s) letting it know access was
not granted. Alternately, it may inform network operations
personnel of the packets that wish to enter the network and
request action to be taken.
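
The sentry and access controller roles can be sketched as
follows. The decision names, the identifier prefixes, and the
controller policy are invented for illustration; the checks an
actual access controller would perform are not given in the
text.

```python
from enum import Enum
from typing import Callable, Set


class Decision(Enum):
    ADMIT = "admit"
    REJECT = "reject"        # discard and send a canned refusal to the source
    ASK_OPERATOR = "ask"     # refer to network operations personnel


def sentry(packet_id: str, access_list: Set[str],
           access_controller: Callable[[str], Decision]) -> Decision:
    """Admit listed IDs at once; hand everything else to the access controller."""
    if packet_id in access_list:
        return Decision.ADMIT
    return access_controller(packet_id)


def example_controller(packet_id: str) -> Decision:
    """Hypothetical policy standing in for the controller's identity checks."""
    if packet_id.startswith("DDN-"):
        return Decision.ADMIT
    if packet_id.startswith("GUEST-"):
        return Decision.ASK_OPERATOR
    return Decision.REJECT


access_list = {"HOST-A"}
for pid in ("HOST-A", "DDN-17", "GUEST-3", "UNKNOWN"):
    print(pid, sentry(pid, access_list, example_controller).value)
```

Keeping the access list check separate from the controller
means the common case (an authorized source) is decided by a
single table lookup.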

F. OTHER CONSIDERATIONS

Two additional areas of concern associated with the
interconnection of two networks are failure notification and
accounting procedures. Assuming that failures for the
connected networks are detected, identified, and isolated
internally by each network, the question arises 'How is the
existence of a failed component within a network
communicated to those in other networks who may wish to use
that component?'. Assuming that the LAN configuration data
base and problem management data base have both been updated
with the current status of any particular failure, the
researcher makes the following proposals in response to the
previous question. Before packets are let into the local
area network, the half-gateway will be responsible for
checking these data bases to ensure that the desired
destination is operational. If it is, and assuming the
access controller has permitted access, the packets are
transmitted into the network. If the desired destination is
currently inoperative, a response indicating such is
returned to the source. Additionally, if a source from
another network desires to check the status of an element
within a SPLICE LAN, it should have the capability, just as
a local user would, of querying either one of these data
bases. Also, it is assumed that if the switching node
through which a SPLICE LAN is connected to the DDN fails,
then the responsibility of reporting the inaccessibility of
that particular local area network lies with the DDN
Monitoring Center whose jurisdiction includes the failed
node. Similarly, the failure of a LAN FEP which makes a
SPLICE LAN configuration inaccessible will be reported to
potential network users by the connecting switching node.
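
The admission check proposed above might be sketched as
follows. The status values and the dictionary standing in for
the configuration and problem management data bases are
assumptions made for the example.

```python
from typing import Dict

OPERATIONAL = "operational"


def gateway_response(destination: str, component_status: Dict[str, str],
                     access_granted: bool) -> str:
    """Decide what the half-gateway does with a packet bound for `destination`."""
    if not access_granted:
        return "refused: access not granted"
    # Status as it would be recorded in the LAN data bases (assumed layout).
    status = component_status.get(destination, "unknown")
    if status != OPERATIONAL:
        return f"refused: destination {destination} is {status}"
    return "forwarded"


status_db = {"HOST-A": "operational", "HOST-B": "failed"}
print(gateway_response("HOST-A", status_db, access_granted=True))
print(gateway_response("HOST-B", status_db, access_granted=True))
```

The same lookup could serve a status query from a source on
another network, since it reads the data bases a local user
would query.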

It is the researcher's opinion that a SPLICE LAN is seen
as just another subscriber to the DDN. This being the case,
there seems to be sufficient justification for the
establishment of some type of accounting procedures which
provide the means through which the flow of packets to and
from the DDN can be monitored. Assuming some type of
accounting will be conducted by the switching node, the
connected LAN could obtain accounting information from it.
This does not provide for any type of cross checking of the
switching node's accounting capability or accuracy. This
then establishes the requirement for some sort of accounting
procedures to be established in the LAN's half of the
gateway. Currently, public packet switching networks are
using procedures which account for subscriber use on the
basis of the number of virtual circuits established during
the accounting period and the number of packets sent on each
virtual circuit [Ref. 29: p. 1400]. Only slight
modifications to certain reports recommended in Chapter 3
would be required to give a SPLICE local area network a
similar accounting capability. Finally, and most important
of all, the accounting mechanisms implemented by the SPLICE
LANs should be based upon procedures and units of measure
identical, or very similar, to those utilized by the DDN.
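
A sketch of accounting on the basis described above (virtual
circuits established and packets sent on each) is given
below. The class, its method names, and the report layout used
for cross checking are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AccountingPeriod:
    """Per-period tallies kept by the LAN half of the gateway."""
    packets_per_circuit: Counter = field(default_factory=Counter)

    def open_circuit(self, circuit_id: str) -> None:
        # Register the circuit even if no packet is ever sent on it.
        self.packets_per_circuit[circuit_id] += 0

    def count_packet(self, circuit_id: str) -> None:
        self.packets_per_circuit[circuit_id] += 1

    def summary(self) -> dict:
        return {
            "virtual_circuits": len(self.packets_per_circuit),
            "packets": sum(self.packets_per_circuit.values()),
        }

    def matches(self, node_report: dict) -> bool:
        """Cross-check against figures reported by the switching node."""
        return self.summary() == node_report


period = AccountingPeriod()
period.open_circuit("vc-1")
for _ in range(3):
    period.count_packet("vc-1")
period.open_circuit("vc-2")
period.count_packet("vc-2")
print(period.summary())
print(period.matches({"virtual_circuits": 2, "packets": 4}))
```

Using the same units as the switching node is what makes the
final comparison meaningful.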

G. CHAPTER SUMMARY

We began this Chapter with a discussion of various
configurations that a gateway between computer networks
could assume. The author feels that the "two half-gateway"
concept offers the simplest and most effective means of
interconnection. The discussion then turned to the problems
associated with different maximum packet sizes utilized by
the two interconnected networks. We looked at the
situations when the LAN maximum packet size was greater than
the long haul network maximum packet size and vice versa.
In both cases, suggestions were made as to how this problem
could be handled. A discussion of flow control and
congestion control techniques was then entered into. This
problem was approached from two directions: first,
controlling the flow of packets into the LAN half-gateway
for transmission to the DDN, and second, controlling the
flow of packets from the Defense Data Network, through the
gateway, into the LAN. The problems of internetwork
addressing and component naming were then considered. The
author has concluded that the solution to the first problem
would be the establishment of a standard internetwork
numbering and addressing scheme. The standard offered was
of the form NETWORK ADDRESS/LOCAL HOST ADDRESS. The
component naming problem was found to be best handled at two
levels. The translation of a network name to a specific
network address would be conducted at the switching node
half-gateway, while the translation of a local host name to
a local host address would be accomplished by the
destination network half-gateway. We then briefly discussed
the topic of access control. There, we looked at the role
played by an access controller, and attempted to add support
for its implementation in the SPLICE network. Finally, we
looked at the need for failure notification and accounting
capabilities associated with internetwork traffic.

Exclusive of the interface between a SPLICE LAN and the
DDN, the monitoring and management of a SPLICE local area
network is predominantly centralized. Special interface
functions such as those described in this Chapter require
that the control of these functions be distributed to the
FEP. Finally, it is the researcher's opinion that the
management of the LAN/DDN interface must not only be
workable, but must be acceptable from an operational
standpoint by the users, and from a technical and logical
standpoint by network operators.

The integration of those management tools discussed in
Chapters 1-5 is accomplished by the local area network
central monitoring site (CMS). It is here where
measurements and statistics are collected, performance
analysis conducted, diagnostic programs and recovery actions
initiated, network utility data bases updated, and where
performance parameter adjustment messages originate. This
process of managing from a central location minimizes
communications and synchronization difficulties, and helps
solve problems that may otherwise pass unnoticed [Ref. 33:
p. 21].

The author will initially present what he feels to be
the mission of a central monitoring site, followed by
appropriately supportive objectives. The manning
requirements and organizational structure associated with a
CMS will then be discussed. From there, a discussion of a
network operator's workbench will be entered into. Finally,
a discussion of a network operator's responsibilities under
both normal and failure conditions will be presented.

A. MISSION OF A LAN MONITORING SITE

The mission of a LAN central monitoring site might be
stated as, 'To ensure the most efficient and effective use
of network resources and to maximize network availability,
throughput and responsiveness'. Objectives which support
the accomplishment of this mission are:

-  Keeping  track  of  the  status  and  configuration  of  the 
network. 

-  Detecting  alarm  conditions  and  failed  components. 

-  Carrying  out  fault  isolation  and  diagnostic  tests. 


- Contacting appropriate repair personnel and monitoring
repair activities.

-  Altering  the  physical  and  logical  network  configurations 
and  documenting  such  alterations. 

-  Adjusting  component  performance  parameters. 

-  Generating  management  reports. 

- Supporting test and acceptance activities.

- Providing information needed for planning future network
evolution.

- Providing a historical data base against which current and
future network performance may be measured.

- Monitoring component utilization throughout the network
(e.g. host, communication processor, and shared resource
utilization).

- Performing a scheduling function for application programs
requiring use of the host processors.

The first eight objectives are similar to those contained
in the Program Plan for the Defense Data Network [Ref. 31:
p. 142]. The 9th and 10th items are objectives of the
Lawrence Livermore Laboratory Octopus Network Monitoring
and Measuring Project [Ref. 34: p. 2]. The final two
objectives are a product of the author's research.

An analysis of these objectives shows that the tools
discussed in this paper are capable of accomplishing the 1st
through the 7th and the 11th objectives directly. Although
not mentioned as yet, the central monitoring site must be
able to support the testing, evaluation and acceptance of
new components that are to become part of the local area
network. By establishing the data bases described in
Chapter 1, we are indirectly supporting the accomplishment
of the 9th and 10th objectives. The establishment of data
bases which record network performance measures and
component utilization statistics would greatly enhance our
ability to meet these objectives. At this time, the
researcher does not see a need for the design of a
scheduling algorithm as proposed by the final objective. As
applications grow, and the number of network users
increases, the requirement for establishing a scheduling
algorithm for the purpose of efficiently and fairly
assigning jobs to host processors may become very real. An
example of where a scheduling algorithm has been implemented
is on the Los Alamos Scientific Laboratory Integrated
Computer Network [Ref. 35].


B. MANNING AND ORGANIZATION OF A LAN CMS

How many people are required to ensure the continued and
efficient operation of a SPLICE local area computer network?
Should the monitoring site be manned around the clock? What
organizational aspects must be considered with the addition
of a central monitoring site? These are the questions that
will be addressed in this section. During the discussion of
each, a possible answer will be recommended by the author.

The manning proposed for the DDN monitoring centers
ranges from four people at the system monitoring center to
one or zero at other centers [Ref. 31: p. 137]. The manning
of the NBSNET (a bus oriented local computer network)
measurement center calls for a full-time and several
part-time computer-electrical engineers [Ref. 36: p. 13].
Neither of these manning levels seems appropriate for a
SPLICE LAN, the DDN being a much larger long haul network,
and the NBSNET manning level reflecting more of an
experimental environment. A local area network with a
structure very similar to that of a SPLICE LAN is the Hughes
Aircraft Company JANET network [Ref. 37]. JANET has
centralized the control and monitoring of the network at a
single operator position. From this position the operator
can issue commands to all network components, perform
testing and performance analysis, detect and diagnose
failures, and reconfigure the network. The degree of
automation recommended throughout this paper would provide
the capabilities required for a 'one person' central
monitoring site. If these and other lower level functions
are not automated, the possibility exists that an additional
operator may be required. The author feels that substantial
processing will occur, and that file transfers between Stock
Points and Inventory Control Points will take place, after
normal working hours. For this reason, it is felt that a
network operator should be available at any time processing
is in progress. After normal working hours, this position
may be filled as a collateral duty (i.e. the individual
filling this role may also be responsible for one of the
host processors).

In answering the question as to where the monitoring
site fits into the organizational picture, one must remember
that the CMS can exercise a great deal of control over the
network and its components. This being the case, those
individuals comprising the CMS must work directly for the
'Director of the LAN'. We would not want the central
monitoring site to come under the control of one of its
users or under the control of one of the staffs associated
with a network host. This seems to add more justification
for establishing the position of 'Director of the LAN'.
This Director could operate out of the central monitoring
site. From here, he could manage and control the operations
and resources of the local network, formulate network
policy, and see to its implementation.

C. A NETWORK OPERATOR'S WORKBENCH


A network operator's workbench is a single, integrated
system containing all the operator's tools in one place.
The system must be interactive and, because new analysis
packages and models will be continuously developed, possess
the characteristics of a programmer's workbench [Ref. 21: p.
4].

Certain hardware assets will enable the operator to
better carry out the network management function. The
terminal or terminals utilized by the CMS should have a
fairly extensive graphics capability. For example, a
display of the entire network could be put on the screen
with different colors indicating the status of various
components. A dedicated printer will be needed for
managerial reports, but more importantly, for the recording
of failure messages received by the CMS. Adequate direct
access storage will also be a necessity. Additionally, an
alarm capability for indicating the breaching of established
parameter thresholds will be required.
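
The alarm capability can be illustrated with a small
threshold check. The measurement names, units, and bound
values below are invented for the example and are not SPLICE
parameters.

```python
from typing import Dict, List, Optional, Tuple

Bounds = Tuple[Optional[float], Optional[float]]  # (minimum, maximum)


def breached(measurements: Dict[str, float],
             thresholds: Dict[str, Bounds]) -> List[str]:
    """Return an alarm message for every measurement outside its bounds."""
    alarms = []
    for name, value in measurements.items():
        low, high = thresholds.get(name, (None, None))
        if low is not None and value < low:
            alarms.append(f"{name}={value} below {low}")
        elif high is not None and value > high:
            alarms.append(f"{name}={value} above {high}")
    return alarms


thresholds = {
    "throughput_kbps": (500.0, None),      # alarm if throughput falls too low
    "response_time_ms": (None, 250.0),     # alarm if response time grows too high
}
print(breached({"throughput_kbps": 420.0, "response_time_ms": 180.0}, thresholds))
```

A measurement with no established threshold raises no alarm,
which matches the idea that thresholds are set explicitly by
the operator.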

There exist numerous software tools that can be
utilized by the network operator. One of the most important
is a good DBMS with a complete, user friendly query
language. This asset will allow the operator to investigate
relationships between performance measurements and
associated parameters, and to ask exploratory questions
concerning the effect of certain network configurations on
performance criteria. The possession of a word processing
capability will also assist the operator in the performance
of his duties. An additional software asset is the actual
process through which the operator interacts with these
tools. This interface may be through a Network Operating
System as currently planned for the DDN [Ref. 11] and
described in [Ref. 32]. Another approach to this problem is
to have the operator interact with the software tools
through an application program. The Hughes Aircraft Company
has implemented a Network Monitor Program for its JANET
network which runs as an application program on one of the
host processors [Ref. 37: p. 96]. The distributed
counterpart of this central control program is a
'background' program included as part of the adaptor
microcode. It is through these background programs that the
CMS receives certain measurements and failure messages.
Additional software tools that will be of help to the
operator include: an English language set of commands for
ease of system operation and network diagnostics, default
parameter value establishment if unspecified by the
operator, dynamic control programs for adjusting lower level
performance parameters in accordance with network
conditions, and finally a system which exists for prompt and
accurate collection of any data the user may provide on a
problem.

D. OPERATOR'S ACTIONS: NORMAL CONDITIONS

In the next two sections we will attempt to identify the
responsibilities of a network operator under both normal and
abnormal conditions. They are presented here in an effort
to establish a basic set of responsibilities for all SPLICE
LAN operators. This section deals with the operator's
responsibilities under normal conditions. It is realized by
the author that some of these responsibilities may also
pertain to failure conditions. Finally, it is not known
which, if any, of the identified responsibilities will be
automated; therefore, the discussion of responsibilities
will be presented as if the operator had to take some
specific action to accomplish each one.


1. Initialization

Assuming the network operator has just invoked the
network management control program, there are certain
functions that must be accomplished. The operator must
establish a connection with the 'background' program in the
adaptor or component interface. Among other things, this
will enable him to find out just who is on-line. Once
connections are established, the operator can send out
instructions to the nodes providing them with guidance as to
what measurements to take, when to send them to the CMS, and
upcoming maintenance activities. Also during this time, the
network operator obtains the physical and logical
configuration tables for each host and communication
processor, which are then stored in a global network
configuration table. During initialization the network
operator also sets performance parameter values, establishes
alarm thresholds for performance measurements, identifies
critical components which he is specifically interested in
monitoring, and updates the Name/Address Table in the FEP
half-gateway.

2. Utility Data Bases

Information obtained from each component about
itself and its associated peripherals is used to update the
network configuration data base. This provides the operator
with a view of the physical state of the network.
Additionally, logical configuration tables can be
established for each component and user, which gives them
their own 'customized' configuration of the network. Also
during this time, problem management, change management and
performance analysis data bases may be opened for read/write
and checked for items of interest. Finally, the operator
needs to communicate with the DDN Monitoring Center within
whose area of influence the LAN falls. This interaction
may simply be the transfer of the current DDN status file to
the CMS. This file can then be used to assist users
attempting internetwork communication.

3. Operator's Displays

The network operator is responsible for monitoring
various network status displays and in some cases ensuring
their availability to users. These status displays are
created from data obtained from configuration and problem
management data bases in addition to results of performance
analysis and component monitoring. As a minimum, the status
displays that should exist include: a global network status
display, displays for each major component with appropriate
operating information, displays for any desired network
performance parameters such as throughput and response time,
a general information display for informing users of
scheduled maintenance, DDN status and administrative
activities, and a display for depicting load information on
hosts and communication processors.

4. Management Activities

In addition to those activities mentioned above, the
network operator is also responsible for the accomplishment
of other normal management activities. He must initiate
monitoring periods for the collection of measurement data.
Upon the completion of the monitoring period he must:
control the transfer of data from the adaptors to the CMS,
disable adaptors from taking additional measurements, and
clear adaptor memory contents if so required.
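
The end-of-period steps listed above can be sketched as
follows. The Adaptor class is a stand-in written for this
example and does not describe actual adaptor microcode.

```python
from typing import Dict, List


class Adaptor:
    """Stand-in for a bus adaptor that records measurements while enabled."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.measuring = False
        self.memory: List[float] = []

    def record(self, value: float) -> None:
        if self.measuring:
            self.memory.append(value)


def start_period(adaptors: List[Adaptor]) -> None:
    for adaptor in adaptors:
        adaptor.measuring = True


def end_period(adaptors: List[Adaptor],
               clear_memory: bool = True) -> Dict[str, List[float]]:
    """Transfer data to the CMS, stop measurement, and optionally clear memory."""
    collected = {}
    for adaptor in adaptors:
        collected[adaptor.name] = list(adaptor.memory)  # transfer to the CMS
        adaptor.measuring = False                        # no further measurements
        if clear_memory:
            adaptor.memory.clear()
    return collected


a = Adaptor("adaptor-1")
start_period([a])
a.record(0.42)
data = end_period([a])
a.record(0.99)  # ignored, since measurement is now disabled
print(data, a.memory)
```

The optional clear_memory argument reflects the "if so
required" qualification in the text.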

Utilizing data gathered and statistics generated
during the monitoring period, the network operator must
ensure the appropriate data bases are updated. This may
include modifications to the configuration, problem
management, and performance analysis data bases.
Information obtained during the monitoring period is also to
be used by the network operator to identify trends, look for
bottlenecks in the network (especially at the DDN/LAN
interface), conduct network performance analysis, and
prepare network status and utilization reports. While
analyzing results of the monitoring period, the operator may
become aware of some pending component failure. If so,
appropriate action is taken to diagnose and correct the
failure. More specific action to be taken upon failure
detection will be discussed in the next section.

Other normal management activities include the
operator's responsibility to test all adaptors' failure
detection and diagnostic capabilities, the distribution of
new software versions, adjustment of network logical and
physical configurations, and adjusting performance
parameters in order to tune the network. The network
operator is also responsible for informing users of planned
maintenance activities and coordinating those activities
with them. One final responsibility calls for CMS personnel
to be involved in the installation, testing, and acceptance
of equipment that is going to become part of the network.

E. OPERATOR'S ACTIONS: COMPONENT FAILURE

Having  utilized  the  performance  analysis  and  the  problem
detection  and  diagnosis  techniques  presented  earlier  in  this
paper,  let  us  assume  the  network  operator  has  identified  a
failed  component.  What  then  are  the  procedures  that  must  be
followed  in  order  to  manage  this  failure  until  its
rectification?  Assuming  the  failure  is  of  major
significance,  such  as  a  down  communication  processor,  one  of
the  first  things  the  operator  should  do  is  notify  the
network  of  the  failed  component.  Concurrently,  the
configuration  tables,  the  problem  management  data  base,  and
the  Name/Address  Table  in  the  FEP  should  be  updated.
Appropriate  entries  for  the  problem  management  and
configuration  data  bases  are  shown  in  Appendix  A  and
Appendix  B  respectively.  Having  done  this,  the  operator  may
utilize  some  form  of  DDT,  as  discussed  in  Chapter  4,  in  an
attempt  to  correct  or  further  isolate  the  cause  of  failure.
If  this  fails,  the  operator  can  utilize  the  information  that
has  been  recorded  in  the  network  history  file  to  try  to
'back  up'  the  processor  to  a  point  before  the  failure
occurred,  and  attempt  a  restart.  The  last  chance  the
operator  has  to  correct  the  problem  is  to  dump  the  suspected
failure-causing  software  to  off-line  storage,  and  reload  the
system  with  a  fresh  copy  of  the  appropriate  software.
Having  exhausted  his  means  of  problem  correction,  the
operator  is  responsible  for  contacting  the  appropriate
vendor.
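
The  procedure  above  is  an  ordered  escalation:  notify,
update  the  records,  then  try  each  corrective  step  in  turn
until  one  succeeds,  and  contact  the  vendor  only  when  all
have  failed.  A  minimal  sketch  of  that  ordering  follows.
Every  name  in  it  (handle_component_failure,  the  callback
parameters,  the  step  labels)  is  a  hypothetical  placeholder
introduced  for  illustration;  the  thesis  prescribes  the
sequence  of  actions,  not  an  implementation.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical type: a named corrective step that reports success or failure.
Step = Tuple[str, Callable[[], bool]]


def handle_component_failure(
    component: str,
    notify: Callable[[str], None],
    update_records: Callable[[str], None],
    corrective_steps: List[Step],
    contact_vendor: Callable[[str], None],
) -> Optional[str]:
    """Run the corrective steps in order and stop at the first success.

    Returns the name of the step that corrected the failure, or None if
    every step failed and the vendor was contacted.
    """
    # Announce the failed component to the network.
    notify(component)
    # Stands in for updating the configuration tables, the problem
    # management data base, and the Name/Address Table.
    update_records(component)
    for name, attempt in corrective_steps:
        if attempt():
            return name
    # All means of correction are exhausted.
    contact_vendor(component)
    return None
```

In  use,  corrective_steps  would  list  the  three  attempts  in
the  order  given  in  the  text:  a  DDT  session,  a  back  up  and
restart  from  the  network  history  file,  and  a  dump  and
reload  of  the  suspect  software.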

During  the  course  of  problem  identification  and
correction,  it  is  required  that  the  network  remain  available
for  customer  use.  To  do  this,  the  network  operator  must
have  the  capability  to  reconfigure  the  network,  disable
processing  of  local  operator  requests  so  that  he  is  in  full
control  of  the  network,  activate  and  deactivate  a
component's  connection  to  the  bus,  and  transfer  functions
performed  by  the  failed  component  to  another  device  capable
of  performing  those  functions.  Although  the  performance  of
the  network  during  this  time  will  not  be  optimum,  it  will  at
least  be  able  to  support  some  processing  requirements.  Upon
failure  correction,  the  operator  is  responsible  for  bringing
the  system  back  to  a  state  of  normal  operation.  This  would
include  updating  the  appropriate  data  bases,  returning
functional  responsibilities  as  required,  and  notifying  users
of  the  resumption  of  normal  services.
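
The  transfer  and  later  return  of  functions  described  above
can  be  sketched  as  a  small  bookkeeping  model.  This  is  an
illustration  under  stated  assumptions,  not  a  design  taken
from  the  thesis:  the  Network  class,  its  fields,  and  the
device  and  function  names  are  all  hypothetical,  and  a
standby  device  is  chosen  simply  as  the  first  capable  device
in  alphabetical  order.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Network:
    # Hypothetical model: device name -> functions it is able to perform.
    capabilities: Dict[str, Set[str]]
    # Function -> device currently performing it.
    assignments: Dict[str, str]
    # Devices whose connection to the bus has been deactivated.
    offline: Set[str] = field(default_factory=set)
    # Function -> device that originally performed it, kept for restoration.
    displaced: Dict[str, str] = field(default_factory=dict)

    def fail_over(self, failed: str) -> Set[str]:
        """Deactivate `failed` and move its functions to capable devices.

        Returns the functions that no remaining device can perform, which
        is the degraded (non-optimum) part of the service.
        """
        self.offline.add(failed)
        unsupported: Set[str] = set()
        for function, device in list(self.assignments.items()):
            if device != failed:
                continue
            # Remember the first owner only, so a later restore returns
            # the function to where it started.
            self.displaced.setdefault(function, failed)
            standby = next(
                (d for d in sorted(self.capabilities)
                 if d not in self.offline and function in self.capabilities[d]),
                None,
            )
            if standby is None:
                del self.assignments[function]
                unsupported.add(function)
            else:
                self.assignments[function] = standby
        return unsupported

    def restore(self, repaired: str) -> None:
        """Reactivate `repaired` and return its original functions to it."""
        self.offline.discard(repaired)
        for function, original in list(self.displaced.items()):
            if original == repaired:
                self.assignments[function] = repaired
                del self.displaced[function]
```

The  set  returned  by  fail_over  is  what  the  operator  would
report  to  users  as  temporarily  unavailable;  calling  restore
corresponds  to  the  return  of  functional  responsibilities
once  the  failure  has  been  corrected.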
F.  CHAPTER  SUMMARY

In  this  Chapter,  we  began  by  presenting  the  mission  of  a
network  central  monitoring  site  and  the  objectives  to  be  met
in  order  to  fulfill  that  mission.  A  discussion  of  the
manning  and  structural  aspects  of  a  CMS  then  followed.
Attention  was  then  focused  on  the  description  of  a  network
operator's  workbench  and  its  associated  tools.  Our  final
discussion  dealt  with  the  identification  of  a  network
operator's  responsibilities  under  both  normal  and  failure
conditions.

It  is  the  author's  opinion  that  the  mission  and
objectives  presented  at  the  beginning  of  this  Chapter
provide  a  complete  and  succinct  picture  of  exactly  why  a
network  central  monitoring  site  exists  and  what  services  it
must  provide  for  the  network.  The  researcher  recommends
that  the  monitoring  of  the  network  be  automated  to  a  point
such  that  only  one  operator  and  his  staff  are  required  to
'control'  the  network.  It  is  also  recommended  that  the
position  of  'Director  of  the  LAN'  be  established  as  part  of
the  CMS,  with  authority  over  all  aspects  of  network
utilization.  The  tools  recommended  as  part  of  the
operator's  workbench  are  seen  as  the  basis  upon  which
network  monitoring,  control,  and  management  will  be
conducted.  Without  them,  the  accomplishment  of  the  central
monitoring  site's  mission  will  be  questionable.  To
conclude,  it  is  the  researcher's  opinion  that  the  network
operator's  responsibilities  we  have  identified  in  this
Chapter,  although  not  all-encompassing,  can  be  used  as  a
basis  upon  which  extended  and  more  specific  requirements  can
be  built.
Type  of  equipment  and  serial  number